Add options to extract additional attributes from source data #11

@jravani

Overview

City2TABULA should support a configuration-driven mechanism that allows users to define which attributes they want to extract from the source dataset and map into the final City2TABULA Building and Child Feature tables. This makes the pipeline adaptable to different countries, source formats, and custom research needs, without requiring code changes.

A YAML configuration file will define:

  • Which attributes exist in the source CityDB schema
  • Which of these the user wants to extract
  • How they should be named in the final database
  • Whether they apply to Building or Child Feature
  • Which LoD level applies (if needed)

This feature decouples attribute selection from the pipeline logic.


Requirements

1. Source Tag

The configuration file must allow users to specify the original attribute name as it exists in the source data (e.g., oorspronkelijkbouwjaar, b3_bouwlagen, constructionYear).

2. Custom Tag (Final Column Name)

Users may optionally define a custom English attribute name that will be used as the column name in the City2TABULA output tables.

Rules:

  • Case-sensitive
  • Must follow PostgreSQL column-naming conventions
  • If a custom name is not provided:
    • Use the source tag if it is valid
    • Otherwise automatically generate a safe name with a prefix (e.g., attr_...)
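The naming rules above could be sketched roughly as follows. This is only an illustration, not the City2TABULA implementation: the helper names `is_valid_column` and `resolve_column_name` are hypothetical, and the exact sanitisation used for the `attr_` fallback is an assumption. The sketch checks names against the rules for unquoted PostgreSQL identifiers (letter or underscore first, then letters, digits, or underscores, at most 63 bytes by default). It does not check reserved keywords, and it does not address quoting, which PostgreSQL needs in order to preserve mixed-case names.

```python
import re
from typing import Optional

# Unquoted PostgreSQL identifiers start with a letter or underscore and
# continue with letters, digits, or underscores. The default length
# limit is 63 bytes. Reserved keywords are NOT checked here.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
MAX_IDENTIFIER_LENGTH = 63


def is_valid_column(name: str) -> bool:
    """Return True if `name` is usable as an unquoted column name."""
    return bool(_IDENTIFIER.fullmatch(name)) and len(name) <= MAX_IDENTIFIER_LENGTH


def resolve_column_name(source: str, custom: Optional[str] = None) -> str:
    """Pick the output column name for one attribute (hypothetical helper)."""
    if custom:
        if not is_valid_column(custom):
            raise ValueError(f"invalid custom column name: {custom!r}")
        return custom
    if is_valid_column(source):
        return source
    # Assumed fallback: replace unsafe characters and add the attr_ prefix.
    safe = re.sub(r"[^A-Za-z0-9_]", "_", source)
    return ("attr_" + safe)[:MAX_IDENTIFIER_LENGTH]
```

With this sketch, a source tag such as `oorspronkelijkbouwjaar` is kept unchanged when no custom name is given, while a tag containing spaces or hyphens falls back to a generated `attr_...` name.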

3. Attribute Target

Each attribute must specify whether it belongs to:

  • building
  • child_feature

This determines the destination table in City2TABULA.

4. LoD Level

The config must allow restricting attributes to specific LoD levels if needed:

  • e.g., lod: 2.2 or lod: [2.1, 2.2]

If omitted, the attribute applies to all LoD levels present in the dataset.
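One possible way to handle both the scalar and the list form of `lod` is to normalise them into a single set before filtering. The sketch below is an assumption about how this might be done, and the function names `normalize_lods` and `applies_to` are hypothetical. Values are compared as strings to avoid floating-point equality checks. Note that a YAML parser reads an unquoted `2.2` as a float, so quoting the values in the config may be the safer convention.

```python
from typing import FrozenSet, Optional


def normalize_lods(spec) -> Optional[FrozenSet[str]]:
    """Turn `lod: 2.2` or `lod: [2.1, 2.2]` into a set of strings.

    Returns None when no restriction was given, meaning "all LoDs".
    """
    if spec is None:
        return None
    if isinstance(spec, (str, int, float)):
        spec = [spec]  # a single value behaves like a one-element list
    return frozenset(str(value) for value in spec)


def applies_to(spec, dataset_lod) -> bool:
    """Return True if an attribute with this `lod` setting applies."""
    lods = normalize_lods(spec)
    return lods is None or str(dataset_lod) in lods
```

Because the comparison is textual, `2` and `2.0` would be treated as different levels; a real implementation would need to settle on one canonical spelling.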


Example YAML Structure

attributes:
  - source: "oorspronkelijkbouwjaar"
    name: "construction_year"
    target: "building"

  - source: "b3_bouwlagen"
    name: "storeys"
    target: "building"

  - source: "b3_opp_dak_schuin"
    name: "roof_slope_area"
    target: "child_feature"
    lod: [2.1, 2.2]

  - source: "height"
    name: "height_m"
    target: "building"
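
As a rough illustration of how such a configuration might be consumed, the sketch below validates each entry and groups the attributes by destination table. It is not the City2TABULA implementation: `group_by_target` is a hypothetical helper, and the `CONFIG` dictionary is written out by hand to mirror what a YAML parser (for example PyYAML's `yaml.safe_load`) would be expected to return for the example above.

```python
from collections import defaultdict

VALID_TARGETS = {"building", "child_feature"}

# Hand-written equivalent of the YAML example (assumed parser output).
CONFIG = {
    "attributes": [
        {"source": "oorspronkelijkbouwjaar", "name": "construction_year", "target": "building"},
        {"source": "b3_bouwlagen", "name": "storeys", "target": "building"},
        {"source": "b3_opp_dak_schuin", "name": "roof_slope_area",
         "target": "child_feature", "lod": [2.1, 2.2]},
        {"source": "height", "name": "height_m", "target": "building"},
    ]
}


def group_by_target(config):
    """Group (source, column, lod) tuples by destination table."""
    grouped = defaultdict(list)
    for index, entry in enumerate(config.get("attributes", [])):
        source = entry.get("source")
        if not source:
            raise ValueError(f"attribute #{index}: 'source' is required")
        target = entry.get("target")
        if target not in VALID_TARGETS:
            raise ValueError(f"attribute #{index}: invalid target {target!r}")
        # Fall back to the source tag when no custom name is given.
        column = entry.get("name") or source
        grouped[target].append((source, column, entry.get("lod")))
    return dict(grouped)


if __name__ == "__main__":
    for table, attrs in group_by_target(CONFIG).items():
        print(table, attrs)
```

Rejecting unknown targets at load time keeps configuration mistakes from surfacing later as missing columns in the output tables.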

Labels: enhancement (New feature or request)