Loom Geo Dataset Editor — Azure-native parity spec¶
Reference: ADLS Gen2 + GeoJSON (RFC 7946) + Shapefile (ESRI) + GeoParquet 1.0 spec + Azure Maps Creator Dataset, captured 2026-05-26 by the catalog agent. There is no Fabric equivalent —
geo-datasetis a Loom-native item that backs bothgeo-maprendering andgeo-queryspatial joins.
What's there in the Azure-native experience¶
Storage layouts¶
- GeoJSON — single
FeatureCollectionper file; native to Azure Maps Web SDKDataSource.importDataFromUrl() - Shapefile —
.shp+.shx+.dbf+.prjquad on ADLS; converted to GeoJSON on read via OGR / GeoPandas in Spark - GeoParquet 1.0 — columnar Parquet with
geometrycolumn (WKB) andgeometadata block; native read in Databricks 17.1+ ST functions and Synapse Spark viageopandas+pyarrow - KML / KMZ / GPX / GML / WKT / CSV — supported via
atlas.io.read()(Spatial IO module) and Sparksedonapackage - Azure Maps Creator dataset — IMDF-converted indoor map, vendor-specific
Azure Maps Creator dataset blade¶
- Dataset header:
Microsoft.Maps/accounts/{name}/creators/{creator}/datasets/{datasetId}+ alias - Conversion source — pointer to the uploaded DWG/IMDF zip
- Feature collections — WFS-queryable:
unit,level,facility,opening,directoryInfo,lineElement,areaElement - Tileset — derived vector tile output
- Style + Map configuration — defaults applied when rendered
Inspector observed in raw GeoJSON/GeoParquet workflows¶
- Feature count
- Bounding box
[west, south, east, north] - Geometry type histogram (Point / LineString / Polygon / Multi* / GeometryCollection)
- Property schema (column name → type → sample value)
- SRID (default 4326 / WGS84; reproject via
st_transform) - Per-feature inspector: properties JSON + geometry preview on a mini-map
What Loom already has¶
- ✅
GeoDatasetEditorstub captures an ADLS path - ✅ Honest sample T-SQL
OPENROWSETover Parquet to inspect via existing/api/items/synapse-serverless-sql-pool/[id]/queryroute - ✅ Cosmos-backed state save through the generic item shell
What's missing for parity¶
- Format auto-detect — sniff the ADLS path (
.geojson/.parquet+geometadata /.shpquad /.kml/.kmz/.gpx) and label the dataset accordingly - Feature count + bbox — backend computes feature count and bbox from a head-scan of the file (GeoJSON streaming parse / Parquet metadata)
- Geometry type histogram — surface "12,431 Point · 84 Polygon · 0 LineString" with badges
- Property schema panel — list properties with type inference + sample values (first 10 features)
- Sample feature inspector — paginated grid of N features with
propertiesJSON + a mini Azure Maps preview of the geometry - SRID display + reproject action — read the SRID from
prj(Shapefile) orcrs(GeoJSON) orgeoblock (GeoParquet); offer "Reproject to 4326" that runsgeo-pipeline - Conversion tool — convert Shapefile or KML → GeoJSON or GeoParquet via a one-click Synapse Spark job
- Validation report — flag invalid geometries (self-intersection, wrong ring order, NaN coords); offer "Fix with
geo_polygon_simplify" action - Link to
geo-map— quick action "Open in map" creates or attaches to a siblinggeo-mapitem - Creator dataset mode — when the dataset is a Maps Creator IMDF dataset, swap the inspector for the facility/level/unit tree
Backend mapping¶
| Loom surface | Real Azure call |
|---|---|
| Sniff format | ADLS REST HEAD /containers/{c}/path/{p} + GET ?range=0-65535 for magic bytes |
| Feature count + bbox (GeoJSON) | Streaming parse via azure-storage-file-datalake Python SDK + ijson |
| Feature count + bbox (GeoParquet) | pyarrow.parquet.read_metadata → geo field block + num_rows |
| Schema + sample | Synapse Serverless OPENROWSET BULK '<adls-https>' FORMAT='PARQUET' TOP 10 (Parquet) or Spark read.json for GeoJSON |
| Convert Shapefile → GeoJSON | Synapse Spark job using geopandas + fiona, output written back to ADLS |
| Reproject | Spark geopandas.to_crs(4326) or Databricks st_transform(geom, 4326) |
| Creator dataset tree | ARM GET .../creators/{c}/datasets/{d}/featurestatesets + WFS GET .../wfs/datasets/{d}/collections |
Required Azure resources¶
- ADLS Gen2 container at
geo/datasets/<id>/(one folder per dataset; reuses existing Loom lakehouse storage) - Synapse Serverless SQL pool (already deployed) for OPENROWSET preview
- Synapse Spark pool (already deployed) for conversion + reprojection jobs
- (optional)
Microsoft.Maps/accounts/creatorsfor Creator dataset mode - Role: managed identity needs
Storage Blob Data Contributoron the container
Build plan¶
| Phase | Work |
|---|---|
| Backend | New /api/items/geo-dataset/[id]/inspect → format sniff + feature count + bbox + geometry type histogram. New /api/items/geo-dataset/[id]/schema → property schema + sample 10 features. New /api/items/geo-dataset/[id]/convert (POST) → submits a Synapse Spark job to convert Shapefile/KML → GeoJSON/GeoParquet. New /api/items/geo-dataset/[id]/reproject (POST) → runs geo-pipeline instance to reproject. |
| Frontend | Replace stub form with: format badge + feature count + bbox map preview at the top; tabs for Schema · Sample features · Validation · Convert/Reproject. Each row in the sample-features grid opens a side panel with the full property JSON and a mini Azure Maps preview. |
| Auto-pairing | On geo-dataset create from a geo-map "create new source", auto-set the path under geo/maps/<mapId>/dataset/<datasetId>/ so the map can read it without extra config. |
Estimated effort¶
1–2 focused sessions. Backend inspect + schema (session 1, mostly Python + Synapse), conversion/reprojection wiring + frontend grid (session 2).