blacha's comments (Hacker News)

Your browser has a very powerful image decoder built into it; offloading the PNG decoding into JavaScript is very resource hungry.

Using maplibre (or any map viewer) you can load blobs of image data out of a tiff and use `Image` or `Canvas` to render the data onto a map.

It's even easier if the TIFFs are already cloud optimised, as they perfectly align 1-to-1 with map tiles and don't need to be rescaled; you can then just render the images straight onto the map. E.g. here is a viewer that loads WebPs out of a 15GB TIFF and uses Canvas to render them onto a map [1]
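A rough sketch of that approach (the URL, byte offset, and length are hypothetical; in practice they come from parsing the COG's tile index, e.g. with a geotiff reader):

```typescript
// HTTP Range headers are inclusive on both ends.
function rangeHeader(offset: number, length: number): string {
  return `bytes=${offset}-${offset + length - 1}`;
}

// Browser-only sketch: fetch one WebP tile's bytes out of a remote COG and let
// the browser's native image decoder turn them into a bitmap (no JS decoding).
async function drawTile(
  url: string,
  offset: number,
  length: number,
  ctx: any, // a CanvasRenderingContext2D in the browser
  x: number,
  y: number,
): Promise<void> {
  const res = await fetch(url, { headers: { Range: rangeHeader(offset, length) } });
  const blob = await res.blob(); // raw WebP bytes from inside the TIFF
  const bitmap = await (globalThis as any).createImageBitmap(blob);
  ctx.drawImage(bitmap, x, y);
}
```

The key point is `createImageBitmap`: the WebP/PNG bytes go straight to the browser's built-in decoder instead of being decoded in JavaScript.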

Unless you are trying to layer all your maps together, you could also stop reprojecting them into web mercator; or, if your goal is to layer them, storing them in web mercator would save a ton of your users' compute time.
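For context, "already in web mercator" means the data lines up with the standard slippy-map grid, where the tile for a given lon/lat is just the textbook formula (a sketch, not code from either project):

```typescript
// Standard web-mercator (EPSG:3857) slippy-map tile math: lon/lat -> tile x/y at zoom z.
function lonLatToTile(lon: number, lat: number, z: number): { x: number; y: number } {
  const n = Math.pow(2, z); // number of tiles per axis at this zoom
  const x = Math.floor(((lon + 180) / 360) * n);
  const latRad = (lat * Math.PI) / 180;
  const y = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n,
  );
  return { x, y };
}
```

If the source imagery is in any other projection, every pixel has to go through a reprojection step like this (or its inverse) before it can be drawn, which is where the client-side compute cost comes from.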

There are a bunch of us who talk web mapping and imagery in the #maplibre and #imagery channels in the OSM US Slack [2]

[1] https://blayne.chard.com/cogeotiff-web/index.html?view=cog&i...

[2] https://github.com/maplibre/maplibre-gl-js#getting-involved


Amazing comments and callouts, thank you! I actually tried to load the raw image data blobs as a layer into MapLibre but couldn't figure out a way to do it, and finally capitulated and did the "bad" move of re-encoding just to get the initial interactive map collection out the door for folks. It sounds like this is in fact possible and I just missed something. I'll take a look at the Image and Canvas sources, thanks!

Re the web mercator reprojection: yeah, it's gnarly that I'm doing it client-side, but it's exactly because I'm working towards the ability to layer them interactively on top of each other (as well as on various basemaps). My projection code is also only half-working at the moment, and it's where I'm spending my time next week. I'm trying to avoid building pipelines to re-encode the GeoTIFFs for as long as I can, since there are 10+TB of them in my backend, which is why you're seeing me do this client-side. This is a solo project, so I need to be really picky about where I spend my time so I can keep moving the ball forward.

I'll join those 2 communities, thank you! Been crazy hard to find folks who are deep in this stuff so most of my learning has been through endless googling down deep dark corners of the web for the past 2 months


Great points, and thank you for the links. The one trade-off here is that uncompressed blobs will require longer downloads than PNG, and I think the network transfer is usually slower than the PNG decoding.

But maybe the sample gist takes a TIFF blob, encodes it to a PNG on the client, and then maplibre decodes the PNG to canvas. That would be quite inefficient, if that's what it's doing.


Those comments were aimed more at pastmaps.

For elevation data, we store our DEM/DSM in S3 as LERC [1] COGs. LERC has a WASM bundle which I think can be used in the browser. We found LERC COGs to be one of the most space-efficient ways of storing high-resolution DEM/DSM data [2]. If you wanted to, you could fetch LERC tiles directly out of a remote COG and use them directly for the terrain heights.

I am more focused on the storage/archiving/publishing of our LiDAR capture program [3] than on web-based visualizations of it, so I am unsure if a LERC COG would even be better for you than a PNG TerrainRGB.
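For anyone comparing the two approaches: TerrainRGB packs heights into the RGB channels of an ordinary PNG with a fixed formula. This is the standard Mapbox Terrain-RGB encoding, shown here just as a sketch (it is not LINZ code):

```typescript
// Mapbox Terrain-RGB: height in metres packed into an RGB pixel,
// 0.1m resolution with a -10000m base offset.
function terrainRgbToHeight(r: number, g: number, b: number): number {
  return -10000 + (r * 256 * 256 + g * 256 + b) * 0.1;
}

function heightToTerrainRgb(height: number): [number, number, number] {
  const v = Math.round((height + 10000) / 0.1);
  return [(v >> 16) & 0xff, (v >> 8) & 0xff, v & 0xff];
}
```

The 0.1m quantisation is the trade-off versus LERC, which stores heights at a configurable error tolerance rather than a fixed step.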

[1] https://www.npmjs.com/package/lerc

[2] https://github.com/linz/elevation/tree/master/docs/tiff-comp...

[3] https://linz.maps.arcgis.com/apps/MapSeries/index.html?appid...


WASM surely could be an improvement over JS, especially for big-data-ish/repetitive jobs, where load on clients might become the next wall after the cloud/server part has been optimized, or when we try to use JS on cloud leaf nodes.


I don't think GeoJSON is a great format for anything with more than a few MB of data.

I wanted to see exactly how bad it is with a largeish dataset, so I exported the New Zealand address dataset [1] with ~2.5M points as a GeoPackage (750MB). QGIS loads this fine; it's a little slow when viewing the entire country, but when zoomed in to city level it is almost instant to pan around.

Using ogr2ogr I converted it to newline-delimited GeoJSON (2.5GB), which crashed QGIS while trying to load it. Using shuf I created a random 100,000-point GeoJSON (~110MB); it was unbearably slow in QGIS, taking 5+ seconds while panning around.

I currently use and recommend FlatGeobuf [2] for most of my working datasets, as it is super quick and doesn't need SQLite to read (e.g. in a browser).

It is also super easy to convert to/from with ogr2ogr:

  ogr2ogr -f flatgeobuf output.fgb input.geojson

[1] https://data.linz.govt.nz/layer/105689-nz-addresses/data/ [2] https://github.com/flatgeobuf/flatgeobuf


Can’t you use clustering techniques?


If you also wanted to pull in a node framework for deserialization, you could import something like `zod` to get a similar level of nice error messages when invalid input is encountered.

If you were in TypeScript, zod would also generate your interfaces for you automatically.


New Zealand regularly contracts aerial survey companies to photograph most of the country, generally at between 30cm and ~2cm resolution. Land Information New Zealand (LINZ) then releases this imagery completely free (CC-BY) [1,2] to the public, and it can be downloaded as GeoTIFFs [3].

I think LINZ decided that <0.05m imagery might have privacy issues, as people can be distinguished in it, and has held back releasing some of it, or reduced its quality.

disclaimer: I work at Land Information New Zealand

[1] https://basemaps.linz.govt.nz/?i=hawkes-bay-urban-2022-0.05m... [2] https://basemaps.linz.govt.nz/?i=christchurch-urban-2021-0.0... [3] https://data.linz.govt.nz/layer/106915-christchurch-005m-urb...


Are your orthomosaics georeferenced?

You could look at storing them as a cloud optimised geotiff (COG) and then adding them directly into your web map.


This is basically exactly what we do: we have created a cloud optimised tar (cotar) [1] by creating a hash index of the files inside the tar.

I work on serving tiled geospatial data [2] (Mapbox vector tiles) to our users as slippy maps, where we serve millions of small (mostly <100KB) files. Our data only changes weekly, so we precompute all the tiles and store them in a tar file in S3.

We compute an index for the tar file, then use S3 range requests to serve the tiles to our users. This means we can generally fetch a tile from S3 with 2 requests (or 1 if the index is cached), generally in ~20-50ms.

To get full coverage of the world with Mapbox vector tiles it is around 270M tiles and a ~90GB tar file, which can be computed from OpenStreetMap data [3].

> Though even that would only work with a subset of compression methods or no compression.

We compress the individual files as a workaround. There are options for indexing a compressed (gzip) tar file, but the benefits of a compressed tar vs compressed files are small for our use case.

[1] https://github.com/linz/cotar (or wip rust version https://github.com/blacha/cotar-rs) [2] https://github.com/linz/basemaps or https://basemaps.linz.govt.nz [3] https://github.com/onthegomap/planetiler


Why not upload those files separately, or in ZIP format?


> Why not upload those files separately,

Doing S3 PUT requests for 260M files every week would cost around $1300 USD/week, which was too much for our budget.

> or in ZIP format?

We looked at zips, but due to the way the header (well, the central file directory) is laid out, finding a specific file inside the zip would require the system to download most of the CFD.

The zip CFD is basically a list of header entries that vary in size (46 bytes + the file name length); to find a specific file you have to iterate the CFD until you find the one you want.

Assuming you have a smallish archive (~1 million files), the CFD for the zip would be somewhere in the order of 50MB+ (depending on filename lengths).

Using a hash index you know exactly where in the index to look for the header entry, so you can use a range request to load just that entry:

  offset = hash(file_name) % slot_count
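Concretely, the lookup can be sketched like this (the hash function, slot size, and record layout here are illustrative only; cotar's real format differs, and collisions require probing to the next slot):

```typescript
// Toy fixed-size hash index: each slot is a fixed-width record
// (e.g. 8-byte hash + 8-byte offset + 8-byte size), so the byte position
// of a file's record is O(1) to compute and can be range-requested directly.
const SLOT_SIZE = 24;

// FNV-1a 32-bit hash (illustrative choice of hash function).
function fnv1a(name: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < name.length; i++) {
    hash ^= name.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Byte offset inside the index file to range-request for a given file name.
function slotByteOffset(fileName: string, slotCount: number): number {
  return (fnv1a(fileName) % slotCount) * SLOT_SIZE;
}
```

Compare this with the zip CFD: instead of iterating variable-length entries, one small range request lands directly on the record for the file you want.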
Another file format that has been gaining popularity recently is PMTiles [1], which uses a tree index; however, it is specifically for tiled geospatial data.

[1] https://github.com/protomaps/PMTiles


Nice tools!

When it is serverside, reading a 50MB CFD is a small task. And once it is read we can store the zipindex for even faster access.

We made 'zipindex' purposely to be a sparse, compact, but still reasonably fast representation of the CFD - just enough to be able to serve the file. Typically it is around an 8:1 reduction of the CFD, but of course it depends a lot on your file names, as you say (the index is zstandard compressed).

Access time from fully compressed data to a random file entry is around 100ms with 1M files. Obviously if you keep the index in memory, it is much less. This time is pretty much linear in the file count, which is why we recommend aiming for 10K files per archive, which keeps the impact pretty minimal.


You mean the cost of the PUT requests becomes significant. That makes sense since AWS doesn't charge for incoming bandwidth. Thanks!


New Zealand has 8G/8G fibre to some homes in biggish cities for around $270 NZD/mo (~$180 USD/mo) [1], and most of the population has access to at least 1000/500 fibre for about $100 NZD/mo (~$65 USD/mo).

It is, however, not the cheapest country in the world, so it somewhat depends on your definition of affordable.

[1] https://www.orcon.net.nz/hyperfibre/


Looks to be using OpenSeadragon [1]. This is just a tiled map, so any web mapping software would also work: OpenLayers, Leaflet, MapLibre, etc.

[1] https://openseadragon.github.io/


Interesting, I've used leaflet a lot in the past (love it!) for GIS but never thought about tiling a more generic image to it. Good call.


Looks to be a service to remove the HTTP referer header when linking to other sites.

Say you're on example.com and click a link to foo.com: the browser will send the HTTP header `Referer: example.com` in the GET request to foo.com, which means foo.com can track how you came to their site.


The only place I've even seen this is torrent sites linking to IMDB. Does it have a legitimate use?


Links in web email clients


LOL, why do you need an "accelerator" for this?


It’s explained in detail in the article.


It's about accelerating the networking performance from the user to the server doing the work. You don't need to do it, but it will always be a limit on performance if you don't, no matter how fast you make the application server itself.


New Zealand: with our borders closed, my company has had multiple job ads open for intermediate/senior developers for over a year.

Every candidate we do interview, generally also has multiple other job offers.


The immigrant supply is cut off and there is upward pressure on wages, but employers are not panicking yet - just grumbling that the usual pay is not attracting good enough candidates.

Plenty of places have not increased wages yet, and the interview process seems about the same.


+1. I've been hearing grumbling for a while now, but as far as I can tell the offer increases have been largely symbolic. Most companies are either still in denial or planning on waiting it out.

