63 commits
57a23e4
initial coding size based geometry prioritization
bertt Jan 29, 2026
4443711
limit geometries to retrieve
bertt Jan 29, 2026
3ce5544
improve performance
bertt Feb 3, 2026
6cc9ebc
add md5 queries
bertt Feb 3, 2026
8c0d428
move md5 queries
bertt Feb 3, 2026
c164c8a
Update md5_queries.md
bertt Feb 3, 2026
8fff1f7
Update md5_queries.md
bertt Feb 3, 2026
5415785
Enhance md5_queries.md with issue and todo sections
bertt Feb 3, 2026
05ab363
Update md5_queries.md
bertt Feb 3, 2026
d0debce
Fix formatting issue in md5_queries.md
bertt Feb 3, 2026
28a9333
Propose exception for first tile on z=0
bertt Feb 3, 2026
ec867b3
filter where hashes based on tile envelope
bertt Feb 4, 2026
c99b726
Document spatial indexing recommendations for MD5 hashes
bertt Feb 4, 2026
4137dac
Fix header formatting and clean up index creation examples
bertt Feb 4, 2026
0b5ae57
Numbered list formatting for query patterns
bertt Feb 4, 2026
05d00bb
Fix heading format in md5_queries.md
bertt Feb 4, 2026
c86df8c
Enhance md5_queries.md with performance notes
bertt Feb 4, 2026
a76f0b1
Update md5_queries.md
bertt Feb 4, 2026
4d7c892
fix octreetiler
bertt Feb 4, 2026
e20e7cf
Merge branch 'md5_implementation' of https://github.com/Geodan/pg2b3d…
bertt Feb 4, 2026
a507f9c
refactor writing tile
bertt Feb 4, 2026
2dc043b
Update md5_queries.md
bertt Feb 4, 2026
c22f184
Initial plan
Copilot Feb 5, 2026
9a5a388
Replace string concatenation with parameterized queries using ANY ope…
Copilot Feb 5, 2026
96c10af
Update md5_queries.md to reflect parameterized query solution
Copilot Feb 5, 2026
b2cdb4d
Improve connection management with try-finally blocks
Copilot Feb 5, 2026
4182510
Update src/b3dm.tileset/SpatialIndexChecker.cs
bertt Feb 5, 2026
d082ce3
Merge pull request #247 from Geodan/copilot/sub-pr-244
bertt Feb 5, 2026
4be180c
Update README.md
bertt Feb 5, 2026
80085d2
add md5 to index check
bertt Feb 5, 2026
d4a1cd7
fix releative path
bertt Feb 5, 2026
b94d6e5
skip redundant check
bertt Feb 5, 2026
3dbcfbd
Update src/b3dm.tileset/QuadtreeTiler.cs
bertt Feb 5, 2026
39d262f
Update src/b3dm.tileset/OctreeTiler.cs
bertt Feb 5, 2026
c73accf
Update src/b3dm.tileset/QuadtreeTiler.cs
bertt Feb 5, 2026
88d5f9e
Update src/b3dm.tileset/QuadtreeTiler.cs
bertt Feb 5, 2026
b67f67b
Update src/b3dm.tileset/QuadtreeTiler.cs
bertt Feb 5, 2026
47d38ed
update octreetiler
bertt Feb 5, 2026
7748bd4
Update src/b3dm.tileset/SpatialIndexChecker.cs
bertt Feb 5, 2026
bf838b7
Update src/b3dm.tileset/QuadtreeTiler.cs
bertt Feb 5, 2026
f893dc0
Update src/b3dm.tileset/QuadtreeTiler.cs
bertt Feb 5, 2026
1145436
Update src/b3dm.tileset/OctreeTiler.cs
bertt Feb 5, 2026
1dc997c
fix typo
bertt Feb 5, 2026
ea33e06
Initial plan
Copilot Feb 5, 2026
9b0a0cb
remove todo
bertt Feb 5, 2026
03c8865
update readme
bertt Feb 5, 2026
4ea012d
Pass processedGeometries to LOD child tiles to maintain deduplication
Copilot Feb 5, 2026
797ed34
Merge branch 'md5_implementation' into copilot/sub-pr-244-again
bertt Feb 5, 2026
48aa307
Update QuadtreeTiler.cs
bertt Feb 5, 2026
777f8b2
Merge pull request #248 from Geodan/copilot/sub-pr-244-again
bertt Feb 5, 2026
f141ec7
remove octree tilehashes
bertt Feb 5, 2026
cd0c4bf
update readme
bertt Feb 5, 2026
ce0434f
Update src/b3dm.tileset/SpatialIndexChecker.cs
bertt Feb 5, 2026
b833424
Merge branch 'md5_implementation' of https://github.com/Geodan/pg2b3d…
bertt Feb 5, 2026
cbf1522
Update README.md
bertt Feb 5, 2026
028a19c
improve error handling
bertt Feb 5, 2026
fe04bf2
Merge branch 'md5_implementation' of https://github.com/Geodan/pg2b3d…
bertt Feb 5, 2026
d5495a4
improve md5 hash index check
bertt Feb 9, 2026
22b9fe3
update solution
bertt Feb 10, 2026
03297e6
update octreetiler
bertt Feb 10, 2026
5a6b4b4
remove todo
bertt Feb 10, 2026
45c749d
add option sortby - area or volume (default area)
bertt Feb 12, 2026
87b3fd2
filter boundingboxes based on original projection (not epsg:4326)
bertt Feb 25, 2026
17 changes: 14 additions & 3 deletions README.md
@@ -140,6 +140,12 @@ If --username and/or --dbname are not specified the current username is used as

--keep_projection (Default: false) Keep projection of input data

--sortby (Default: AREA) Sort features by AREA/VOLUME
AREA is the default and faster (uses ST_Area).
VOLUME is slower (uses 3D bounding-box volume) and is useful when
geometries have relatively large Z components (e.g. infrastructure /
vertical walls).

--subdivision (Default: QUADTREE) Subdivision schema QUADTREE/OCTREE

--help Display this help screen.
@@ -220,10 +226,11 @@ For styling see [styling 3D Tiles](styling.md)
Input geometries must be of type LineString/MultiLineString/Polygon/MultiPolygon/PolyhedralSurface (with z values). When the geometry is not triangulated, pg2b3dm will perform
triangulation. Geometries with interior rings are supported.

For large datasets create a spatial index on the geometry column:
For large datasets create the following indexes:

```
psql> CREATE INDEX ON the_table USING gist(st_centroid(st_envelope(geom_triangle)));
psql> CREATE INDEX ON the_table USING gist (st_centroid(st_envelope(geom)));
psql> CREATE INDEX ON the_table using btree(md5(st_asbinary(geom)::text));
```

When the spatial index is not present, the following warning is shown.
@@ -264,7 +271,11 @@ When the input geometries are distributed in a flat area (like buildings in a ci

OCTREE is used when the input geometries are distributed in a cube-like area.

Most features are supported when using OCTREE subdivision, except LOD support;
Most features are supported when using OCTREE subdivision, except

- LOD support;

- Update boundingboxes when using explicit tiling;

## Query parameter

134 changes: 134 additions & 0 deletions md5_queries.md
@@ -0,0 +1,134 @@
## Queries for MD5


### Initial

1] Get bounding box whole table (1.9 s)

```sql
SELECT st_xmin(geom1),st_ymin(geom1), st_xmax(geom1), st_ymax(geom1), st_zmin(geom1), st_zmax(geom1) FROM (select st_transform(ST_3DExtent(geom), 4979) as geom1 from bertt.nantes_reconstructed_buildings ) as t
```
Result:

```
-1.8471041030488762 47.14626298148698 -1.1473131952502678 47.62268076404559 34.427586472817715 475.03764899302183
```

## Tile 0_0_0.glb

2] Count geometries in bounding box (0.2s)

```sql
SELECT COUNT(geom) FROM bertt.nantes_reconstructed_buildings WHERE ST_Centroid(ST_Envelope(geom)) && st_transform(ST_MakeEnvelope(-1.847105103048876, 47.14626198148698, -1.1473121952502678, 47.62268176404559, 4326), 5698)
```

Result: 385856

3] Get geometries for tile 0_0_0.glb - 1000 largest geometries in whole table (2 s)

```sql
SELECT ST_AsBinary(st_transform(geom, 4978)), id , MD5(ST_AsBinary(geom)::text) as geom_hash FROM bertt.nantes_reconstructed_buildings where ST_Centroid(ST_Envelope(geom)) && st_transform(ST_MakeEnvelope(-1.847105103048876, 47.14626198148698, -1.1473121952502678, 47.62268176404559, 4326), 5698) ORDER BY ST_Area(ST_Envelope(geom)) DESC LIMIT 1000
```

The MD5 hashes of the selected geometries (for example '9759cdee666f512a0c13df8245b667f9') are remembered so that these geometries can be excluded from tiles at higher levels (z).

Potential improvement: make an exception for the first tile at z=0 and skip the envelope filter, since all features are included anyway.
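
The "remember the hashes, exclude them at the next level" idea can be sketched in a few lines of Python. This is an illustration only, not the pg2b3dm implementation: `geom_hash` and `select_for_tile` are hypothetical names, the features are toy byte strings, and the hash only matches `MD5(ST_AsBinary(geom)::text)` under the assumption that PostgreSQL runs with the default `bytea_output = 'hex'`.

```python
import hashlib


def geom_hash(wkb: bytes) -> str:
    """Approximate MD5(ST_AsBinary(geom)::text), assuming bytea_output = 'hex'."""
    # PostgreSQL renders bytea as text in the form '\x' + lowercase hex digits.
    return hashlib.md5(("\\x" + wkb.hex()).encode("ascii")).hexdigest()


def select_for_tile(features, processed, limit=1000):
    """Pick the `limit` largest unused features and remember their hashes.

    `features` is a list of (wkb, area) pairs inside the tile envelope.
    `processed` is the set of hashes already written to lower-level tiles.
    """
    # Equivalent of: MD5(...) != ALL($1)
    candidates = [f for f in features if geom_hash(f[0]) not in processed]
    # Equivalent of: ORDER BY ST_Area(ST_Envelope(geom)) DESC LIMIT 1000
    candidates.sort(key=lambda f: f[1], reverse=True)
    chosen = candidates[:limit]
    processed.update(geom_hash(wkb) for wkb, _ in chosen)
    return chosen


# Toy data: five fake geometries with areas 1.0 .. 5.0 (hypothetical values).
features = [(bytes([i]), float(i)) for i in range(1, 6)]
processed = set()
level0 = select_for_tile(features, processed, limit=2)
level1 = select_for_tile(features, processed, limit=2)
```

With a limit of 2, the level-0 call takes the two largest features and the level-1 call takes the next two, because the first two hashes are already in `processed`.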

## Tile 1_0_0.glb (level 1, x=0, y=0)

4] Filter the hashes from the previous list down to the geometries that lie within tile 1_0_0

```sql
SELECT MD5(ST_AsBinary(geom)::text) as geom_hash
FROM bertt.nantes_reconstructed_buildings
WHERE MD5(ST_AsBinary(geom)::text) = ANY($1)
AND ST_Within(
ST_Centroid(ST_Envelope(geom)),
ST_Transform(ST_MakeEnvelope($2, $3, $4, $5, 4326), 5698)
)
```

Note: Using parameterized query with array parameter instead of string concatenation.
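
The envelope passed as `$2..$5` for tile 1_0_0 appears to be the lower-left quarter of the level-0 envelope used in steps 2 and 3. A minimal Python sketch of that subdivision follows; `child_envelope` is a hypothetical helper, and the orientation (x growing east, y growing north from the lower-left corner) is an assumption that the numbers in this document confirm only for tile 1_0_0.

```python
def child_envelope(root, z, x, y):
    """Envelope (xmin, ymin, xmax, ymax) of quadtree tile z/x/y inside `root`."""
    xmin, ymin, xmax, ymax = root
    n = 2 ** z                      # number of tiles per axis at level z
    w = (xmax - xmin) / n           # tile width
    h = (ymax - ymin) / n           # tile height
    return (xmin + x * w, ymin + y * h, xmin + (x + 1) * w, ymin + (y + 1) * h)


# Level-0 envelope taken from the queries for tile 0_0_0.glb.
root = (-1.847105103048876, 47.14626198148698,
        -1.1473121952502678, 47.62268176404559)
tile_1_0_0 = child_envelope(root, 1, 0, 0)
```

Up to floating-point rounding, `tile_1_0_0` matches the envelope used in the level-1 queries in this document (xmax -1.497208649149572, ymax 47.384471872766284).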

5] Count geometries in bounding box on level 1 excluding the geometries from tile 0_0_0.glb, including only the geometries within the tile

```sql
SELECT COUNT(geom) FROM bertt.nantes_reconstructed_buildings WHERE ST_Centroid(ST_Envelope(geom)) && st_transform(ST_MakeEnvelope(-1.847105103048876, 47.14626198148698, -1.497208649149572, 47.384471872766284, 4326), 5698) AND MD5(ST_AsBinary(geom)::text) != ALL($1)
```

Note: Using parameterized query with array parameter instead of string concatenation.

Result: 235787

6] Get geometries for tile 1_0_0.glb - 1000 largest geometries in tile 1_0_0

```sql
SELECT ST_AsBinary(st_transform(geom, 4978)), id , MD5(ST_AsBinary(geom)::text) as geom_hash FROM bertt.nantes_reconstructed_buildings where ST_Centroid(ST_Envelope(geom)) && st_transform(ST_MakeEnvelope(-1.847105103048876, 47.14626198148698, -1.497208649149572, 47.384471872766284, 4326), 5698) AND MD5(ST_AsBinary(geom)::text) != ALL($1) ORDER BY ST_Area(ST_Envelope(geom)) DESC LIMIT 1000
```

Note: Using parameterized query with array parameter instead of string concatenation.

## Issue

The list of hashes can get long (up to z*1000 items). Previously it was inlined into the SQL via string concatenation, which could cause performance problems and potential SQL injection vulnerabilities.

**Solution**: Now using parameterized queries with PostgreSQL's `= ANY()` and `!= ALL()` operators for better performance and security.

## Spatial indexing

Recommended Indexes

1. MD5 Hash Index (BTREE)

CREATE INDEX idx_geom_centroid_hash ON the_table
USING btree(MD5(ST_AsBinary(geom_triangle)::text));

2. Spatial Index (GIST) - Still Required

CREATE INDEX idx_geom_centroid_spatial ON the_table
USING gist(ST_Centroid(ST_Envelope(geom_triangle)));

Rationale

The queries now use three main patterns:

1] Spatial filtering with MD5 hash exclusion (GetGeometrySubset): WHERE ST_Centroid(ST_Envelope(geom_triangle)) && <envelope>
AND MD5(ST_AsBinary(geom_triangle)::text) != ALL($1)

2] MD5 hash filtering with spatial validation (FilterHashesByEnvelope): WHERE MD5(ST_AsBinary(geom_triangle)::text) = ANY($1)
AND ST_Within(ST_Centroid(ST_Envelope(geom_triangle)), <envelope>)

3] Hash-only filtering (GetGeometriesBoundingBox): WHERE MD5(ST_AsBinary(geom_triangle)::text) = ANY($1)

Performance Notes:

1] The GIST spatial index handles the ST_Centroid(ST_Envelope(geom_triangle)) predicates

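2] The MD5 hash BTREE index handles the MD5(ST_AsBinary(geom_triangle)::text) = ANY predicates; the != ALL exclusion is an inequality, which a BTREE index cannot serve, so it is applied as a filter on the rows found through the spatial index

3] PostgreSQL can combine both indexes (bitmap index scan) for queries with both a spatial and an = ANY predicate, depending on the planner's cost estimates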

4] Using parameterized queries with ANY/ALL operators provides better performance than string-concatenated IN/NOT IN clauses

Optional: Materialized Hash Column

## Solution

The hash filtering now uses PostgreSQL's `= ANY(@param)` operator with array parameters instead of string concatenation:

1. **Hash Inclusion (IN clause)**: Changed from `MD5(...) IN ('hash1', 'hash2', ...)` to `MD5(...) = ANY(@hashes)` with parameterized array
2. **Hash Exclusion (NOT IN clause)**: Changed from `MD5(...) NOT IN ('hash1', 'hash2', ...)` to `MD5(...) != ALL(@excludeHashes)` with parameterized array

Benefits:
- Eliminates SQL injection risk (even though MD5 hashes have a predictable format)
- Better performance with large hash lists
- Cleaner, more maintainable code
- Proper use of parameterized queries
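
The change can be illustrated with a small Python sketch that builds the query text and its parameters separately. It is not the project's C# code: `build_count_query` is a hypothetical helper, and the `%(name)s` placeholder is the psycopg-style equivalent of Npgsql's `@excludeHashes`, chosen here as an assumption for illustration.

```python
def build_count_query(table, geom_col, where, exclude_hashes=None):
    """Return (sql, params) for a feature count with optional hash exclusion."""
    params = {}
    if exclude_hashes:
        # Only the placeholder enters the SQL text; the hashes travel
        # separately as a single array parameter.
        where += f" AND MD5(ST_AsBinary({geom_col})::text) != ALL(%(exclude_hashes)s)"
        params["exclude_hashes"] = sorted(exclude_hashes)
    sql = f"SELECT COUNT({geom_col}) FROM {table} WHERE {where}"
    return sql, params


sql, params = build_count_query("the_table", "geom", "TRUE", {"b", "a"})
```

The SQL text stays the same length no matter how many hashes are excluded. Table and column names are still interpolated into the string, as in the PR's own code, so they must come from trusted configuration.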

## Todo

- ~~idea: make a temporary blacklist table with the to-be-excluded hashes?~~ (Solved using parameterized arrays)

- idea: force use of id column (longs)?

- ~~Other solutions?~~ (Implemented using `ANY` and `ALL` operators)
Empty file added src/README.md
Empty file.
1 change: 0 additions & 1 deletion src/b3dm.tileset.tests/CesiumTilerTests.cs
@@ -1,7 +1,6 @@
using System;
using System.Collections.Generic;
using System.IO;
using B3dm.Tileset.settings;
using NUnit.Framework;
using pg2b3dm;
using subtree;
29 changes: 29 additions & 0 deletions src/b3dm.tileset.tests/GeometryRepositoryOrderByTests.cs
@@ -0,0 +1,29 @@
using NUnit.Framework;

namespace B3dm.Tileset.Tests;

public class GeometryRepositoryOrderByTests
{
[Test]
public void GetOrderBy_Area_UsesEnvelopeArea()
{
var sql = GeometryRepository.GetOrderBy("geom", SortBy.AREA);
Assert.That(sql, Does.Contain("ST_Area(ST_Envelope(geom))"));
Assert.That(sql, Does.Contain("DESC"));
}

[Test]
public void GetOrderBy_Volume_UsesZExtents()
{
var sql = GeometryRepository.GetOrderBy("geom", SortBy.VOLUME);
Assert.That(sql, Does.Contain("ST_ZMax(geom)"));
Assert.That(sql, Does.Contain("ST_ZMin(geom)"));
}

[Test]
public void TilingSettings_DefaultSortBy_IsArea()
{
var settings = new TilingSettings();
Assert.That(settings.SortBy, Is.EqualTo(SortBy.AREA));
}
}
61 changes: 61 additions & 0 deletions src/b3dm.tileset.tests/GeometryRepositoryWhereTests.cs
@@ -0,0 +1,61 @@
using NUnit.Framework;
using Wkx;

namespace B3dm.Tileset.Tests;

public class GeometryRepositoryWhereTests
{
[Test]
public void GetWhere_2D_UseSourceEpsgDirectly()
{
var from = new Point(100000.0, 400000.0);
var to = new Point(200000.0, 500000.0);

var sql = GeometryRepository.GetWhere("geom", from, to, "", 5698);

Assert.That(sql, Does.Contain("5698"));
Assert.That(sql, Does.Not.Contain("4326"));
Assert.That(sql, Does.Not.Contain("ST_Transform"));
Assert.That(sql, Does.Contain("ST_MakeEnvelope"));
}

[Test]
public void GetWhere_2D_Epsg4326_UseSourceEpsgDirectly()
{
var from = new Point(-75.8, 38.4);
var to = new Point(-75.0, 39.8);

var sql = GeometryRepository.GetWhere("geom", from, to, "", 4326);

Assert.That(sql, Does.Contain("4326"));
Assert.That(sql, Does.Not.Contain("ST_Transform"));
Assert.That(sql, Does.Contain("ST_MakeEnvelope"));
}

[Test]
public void GetWhere_3D_UseSourceEpsgDirectly()
{
var from = new Point(100000.0, 400000.0, 200.0);
var to = new Point(200000.0, 500000.0, 300.0);

var sql = GeometryRepository.GetWhere("geom", from, to, "", 5698);

Assert.That(sql, Does.Contain("5698"));
Assert.That(sql, Does.Not.Contain("4979"));
Assert.That(sql, Does.Not.Contain("4326"));
Assert.That(sql, Does.Not.Contain("ST_Transform"));
Assert.That(sql, Does.Contain("ST_3DMakeBox"));
}

[Test]
public void GetWhere_3D_ContainsCentroidAndSrid()
{
var from = new Point(100000.0, 400000.0, 200.0);
var to = new Point(200000.0, 500000.0, 300.0);

var sql = GeometryRepository.GetWhere("geom", from, to, "", 5698);

Assert.That(sql, Does.Contain("st_setsrid"));
Assert.That(sql, Does.Contain("ST_3DIntersects"));
}
}
28 changes: 27 additions & 1 deletion src/b3dm.tileset/BoundingBoxRepository.cs
@@ -1,4 +1,5 @@
using System.Data;
using System;
using System.Data;
using Wkx;

namespace B3dm.Tileset;
@@ -16,6 +17,31 @@ public static (BoundingBox bbox, double zmin, double zmax) GetBoundingBoxForTabl
return bbox3d;
}

public static (BoundingBox bbox, double zmin, double zmax) GetBoundingBoxAs4979(IDbConnection conn,(BoundingBox bbox, double zmin, double zmax) bboxTable, int sourceEpsg)
{
if (sourceEpsg == 4979 || sourceEpsg == 4326) {
return bboxTable;
}
var bbox = bboxTable.bbox;
var sqlBounds = FormattableString.Invariant($@"SELECT st_xmin(geom1),st_ymin(geom1), st_xmax(geom1), st_ymax(geom1), st_zmin(geom1), st_zmax(geom1)
FROM (
SELECT ST_3DExtent(
ST_Transform(
ST_SetSRID(
ST_3DMakeBox(
ST_MakePoint({bbox.XMin}, {bbox.YMin}, {bboxTable.zmin}),
ST_MakePoint({bbox.XMax}, {bbox.YMax}, {bboxTable.zmax})
),
{sourceEpsg}
),
4979
)
) AS geom1
) AS t");
return GetBounds(conn, sqlBounds);
}


private static (BoundingBox, double, double) GetBounds(IDbConnection conn, string sql)
{
conn.Open();
1 change: 0 additions & 1 deletion src/b3dm.tileset/CesiumTiler.cs
@@ -3,7 +3,6 @@
using System.IO;
using System.Linq;
using B3dm.Tileset.Extensions;
using B3dm.Tileset.settings;
using Newtonsoft.Json;
using subtree;
using Wkx;
33 changes: 23 additions & 10 deletions src/b3dm.tileset/FeatureCountRepository.cs
@@ -1,23 +1,36 @@
using Npgsql;
using System.Collections.Generic;
using System.Linq;
using Npgsql;
using Wkx;

namespace B3dm.Tileset;

public static class FeatureCountRepository
{
public static int CountFeaturesInBox(NpgsqlConnection conn, string geometry_table, string geometry_column, Point from, Point to, string query, int source_epsg, bool keepProjection = false)
public static int CountFeaturesInBox(NpgsqlConnection conn, string geometry_table, string geometry_column, Point from, Point to, string query, int source_epsg, HashSet<string> excludeHashes = null)
{
var select = $"COUNT({geometry_column})";
var where = GeometryRepository.GetWhere(geometry_column, from, to, query, source_epsg, keepProjection);
var where = GeometryRepository.GetWhere(geometry_column, from, to, query, source_epsg);

// Add hash exclusion filter using parameterized query
if (excludeHashes != null && excludeHashes.Count > 0) {
where += $" AND MD5(ST_AsBinary({geometry_column})::text) != ALL(@excludeHashes)";
}

var sql = $"SELECT {select} FROM {geometry_table} WHERE {where}";
conn.Open();
var cmd = new NpgsqlCommand(sql, conn);
var reader = cmd.ExecuteReader();
reader.Read();
var count = reader.GetInt32(0);
reader.Close();
conn.Close();
return count;
try {
using var cmd = new NpgsqlCommand(sql, conn);
if (excludeHashes != null && excludeHashes.Count > 0) {
cmd.Parameters.AddWithValue("excludeHashes", excludeHashes.ToArray());
}
using var reader = cmd.ExecuteReader();
reader.Read();
var count = reader.GetInt32(0);
return count;
}
finally {
conn.Close();
}
}
}