Understanding Outputs
This guide explains the output files generated by the Exposome Geocoder and how to interpret the results.
All geocoded results are saved in the output/ folder within your project directory:
ExposomeGeocodingProject/
├── input_address/
├── output/ # Generated by geocoder
│ ├── coordinates_from_address_<timestamp>.zip
│ ├── geocoded_fips_codes_<timestamp>.zip
│ ├── LOCATION.csv # If provided as input
│ ├── LOCATION_HISTORY.csv # If provided as input
│ └── log/ # Error logs (if any)
Note: <timestamp> format: YYYYMMDD_HHMMSS (e.g., 20250624_150230)
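If you need to sort or compare runs, the timestamp can be recovered from a ZIP filename. The following is a minimal sketch that assumes the filename ends in `_<YYYYMMDD_HHMMSS>.zip` as described in the note above; `parse_output_timestamp` is a hypothetical helper and not part of the geocoder.

```python
import re
from datetime import datetime

# Assumed pattern: the timestamp is the final "_YYYYMMDD_HHMMSS" before ".zip".
TIMESTAMP_RE = re.compile(r"_(\d{8}_\d{6})\.zip$")


def parse_output_timestamp(filename: str) -> datetime:
    """Extract the run timestamp from an output ZIP filename."""
    match = TIMESTAMP_RE.search(filename)
    if match is None:
        raise ValueError(f"no timestamp found in {filename!r}")
    return datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")


ts = parse_output_timestamp("coordinates_from_address_20250624_150230.zip")
print(ts.isoformat())  # 2025-06-24T15:02:30
```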
| File | Contents |
|---|---|
| coordinates_from_address_<timestamp>.zip | Input data + latitude/longitude coordinates |
| geocoded_fips_codes_<timestamp>.zip | Input data + FIPS codes + geocoding results |
| LOCATION.csv | Updated location table (if provided as input) |
| LOCATION_HISTORY.csv | Location history table (copied from input) |
<original_filename>_with_coordinates.csv (inside ZIP)
Input: patients_address.csv
Output: patients_address_with_coordinates.csv
| Column Name | Description | Example |
|---|---|---|
| latitude | Geocoded latitude in decimal degrees | 30.276390 |
| longitude | Geocoded longitude in decimal degrees | -81.399170 |
| score | Geocoding confidence score (0-1) | 0.978 |
| precision | Geocoding precision level | range, street, zip |
| geocode_result | Geocoding outcome status | geocoded, Imprecise Geocode |
| reason | Failure reason (if applicable) | Street missing, Zip missing |
Original input:
| street | city | state | zip | year | entity_id |
|---|---|---|---|---|---|
| 1250 W 16th St | Jacksonville | FL | 32209 | 2019 | 1 |
Output with coordinates:
| street | city | state | zip | year | entity_id | latitude | longitude | score | precision | geocode_result | reason |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1250 W 16th St | Jacksonville | FL | 32209 | 2019 | 1 | 30.276390 | -81.399170 | 0.978 | street | geocoded | |
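As an illustration of how these columns might be used, the sketch below separates rows that geocoded cleanly from rows that need review. It reads an in-memory CSV so that it runs as-is; the second sample row is invented for illustration, and both the `min_score` threshold of 0.9 and the `split_by_quality` helper are assumptions rather than geocoder defaults.

```python
import csv
import io

# Sample data using the documented column names.
# The second row is invented for illustration only.
SAMPLE = """street,city,state,zip,year,entity_id,latitude,longitude,score,precision,geocode_result,reason
1250 W 16th St,Jacksonville,FL,32209,2019,1,30.276390,-81.399170,0.978,street,geocoded,
,Jacksonville,FL,32209,2019,2,,,,zip,Imprecise Geocode,Street missing
"""


def split_by_quality(rows, min_score=0.9):
    """Return (good, review): rows that geocoded with score >= min_score, and the rest."""
    good, review = [], []
    for row in rows:
        # A blank score is treated as 0.0 (assumption).
        score = float(row["score"]) if row["score"] else 0.0
        if row["geocode_result"] == "geocoded" and score >= min_score:
            good.append(row)
        else:
            review.append(row)
    return good, review


good, review = split_by_quality(csv.DictReader(io.StringIO(SAMPLE)))
print(len(good), "usable;", len(review), "need review")
```

For real output, the same function could be applied to a `csv.DictReader` opened on the extracted `_with_coordinates.csv` file.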
<original_filename>_with_fips.csv (inside ZIP)
Input: patients_address.csv
Output: patients_address_with_fips.csv
| Column Name | Description | Example |
|---|---|---|
| fips | 11-digit Census Tract identifier | 12031000100 |
| fips_state | 2-digit state FIPS code | 12 |
| fips_county | 3-digit county FIPS code | 031 |
| fips_tract | 6-digit tract code | 000100 |
| census_block_group_2010 | 2010 Census block group | 12031000100 |
| census_tract_id_2010 | 2010 Census tract ID | 12031000100 |
| census_block_group_2020 | 2020 Census block group | 12031000101 |
| census_tract_id_2020 | 2020 Census tract ID | 12031000101 |
| entity_id | latitude | longitude | fips | fips_state | fips_county | fips_tract | geocode_result |
|---|---|---|---|---|---|---|---|
| 1 | 30.276390 | -81.399170 | 12031000100 | 12 | 031 | 000100 | geocoded |
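The 11-digit `fips` value is the concatenation of the state (2 digits), county (3 digits), and tract (6 digits) codes. The sketch below shows that decomposition; `split_tract_fips` is a hypothetical helper. It also left-pads with zeros, on the assumption that the value may have been read as a number and lost a leading zero (which happens for states whose code starts with 0).

```python
def split_tract_fips(fips) -> dict:
    """Split an 11-digit tract FIPS code into state, county, and tract parts."""
    # Restore leading zeros lost if the value was parsed as an integer.
    code = str(fips).strip().zfill(11)
    if len(code) != 11 or not code.isdigit():
        raise ValueError(f"expected an 11-digit tract FIPS code, got {fips!r}")
    return {
        "fips_state": code[:2],    # 2-digit state code
        "fips_county": code[2:5],  # 3-digit county code
        "fips_tract": code[5:],    # 6-digit tract code
    }


print(split_tract_fips("12031000100"))
# {'fips_state': '12', 'fips_county': '031', 'fips_tract': '000100'}
```

When loading these files into a spreadsheet or data frame, reading the FIPS columns as text avoids the leading-zero problem in the first place.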
The reason column explains why geocoding failed or was imprecise:
| Reason | Description | Solution |
|---|---|---|
| Hospital address given | Detected known hospital address | Acceptable if intentional; verify addresses |
| Street missing | No street information provided | Add street address to input |
| Blank/Incomplete address | Address field is empty or incomplete | Complete address data |
| Zip missing | ZIP code not provided | Add ZIP code to input |
| (empty) | No issues detected | None needed |
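A quick tally of the `reason` column can show which input problems are most common. This is a minimal sketch over rows already loaded as dictionaries; the sample rows and the `tally_reasons` helper are illustrative assumptions, and blank reasons are grouped under the label "no issue".

```python
from collections import Counter


def tally_reasons(rows):
    """Count rows per reason; blank or missing reasons are counted as 'no issue'."""
    return Counter((row.get("reason") or "").strip() or "no issue" for row in rows)


# Illustrative rows; real rows would come from the geocoder's output CSV.
rows = [
    {"entity_id": "1", "reason": ""},
    {"entity_id": "2", "reason": "Street missing"},
    {"entity_id": "3", "reason": "Zip missing"},
    {"entity_id": "4", "reason": "Street missing"},
]

counts = tally_reasons(rows)
for reason, n in counts.most_common():
    print(f"{reason}: {n}")
```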
The geocoder flags known hospital addresses to prevent:
- Using institutional addresses instead of residential addresses
- Imprecise location data for patient exposure analysis
Expanding hospital detection:
You can add known hospital addresses to HOSPITAL_ADDRESSES in the Address_to_FIPS.py script.
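Before adding an entry, a raw address has to be reduced to the format described under "Format requirements" in this section. The sketch below shows one way to do that. `normalize_hospital_address` is a hypothetical helper, not a function from Address_to_FIPS.py, and replacing punctuation with a space (rather than deleting it) is an assumption; compare the result with existing entries before relying on it.

```python
import re


def normalize_hospital_address(raw: str) -> str:
    """Lowercase an address, drop punctuation, and collapse whitespace to single spaces."""
    lowered = raw.lower()
    # Replace every character that is not a lowercase letter, digit, or whitespace.
    cleaned = re.sub(r"[^a-z0-9\s]", " ", lowered)
    # split() with no arguments also removes newlines and repeated spaces.
    return " ".join(cleaned.split())


print(normalize_hospital_address("1600 SW Archer Rd, Gainesville, FL 32610"))
# 1600 sw archer rd gainesville fl 32610
```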
Format requirements:
- Single-line string
- Lowercase letters and numbers only
- No commas or special characters
- Fields separated by single spaces
- Example: "1600 sw archer rd gainesville fl 32610"
output/
├── OMOP_data/
│ ├── valid_address/ # Records with address, no lat/lon
│ ├── invalid_lat_lon_address/ # Records missing both
│ └── valid_lat_long/ # Records with lat/lon
│
├── OMOP_FIPS_result/
│ ├── address/
│ │ ├── address_with_coordinates.zip
│ │ └── address_with_fips.zip
│ ├── latlong/
│ │ └── latlong_with_fips.zip
│ └── invalid/ # Usually empty
│
├── LOCATION.csv
└── LOCATION_HISTORY.csv
| Folder | Contents | Processing |
|---|---|---|
| valid_address/ | Records with complete address but no coordinates | Geocoded to get lat/lon, then FIPS |
| valid_lat_long/ | Records with existing coordinates | Directly converted to FIPS |
| invalid_lat_lon_address/ | Records missing both address and coordinates | Cannot be geocoded |
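The routing described in this table can be sketched as a small classifier. This is an illustration of the rule, not the geocoder's actual code: the field names (`street`, `city`, `state`, `zip`, `latitude`, `longitude`) and the definition of a "complete" address are assumptions, and `classify_record` is a hypothetical helper.

```python
ADDRESS_FIELDS = ("street", "city", "state", "zip")  # assumed field names


def _present(value) -> bool:
    """True if the value is neither None nor blank."""
    return value is not None and str(value).strip() != ""


def classify_record(record: dict) -> str:
    """Return the OMOP_data subfolder a record would fall into."""
    has_coords = _present(record.get("latitude")) and _present(record.get("longitude"))
    has_address = all(_present(record.get(field)) for field in ADDRESS_FIELDS)
    if has_coords:
        return "valid_lat_long"   # existing coordinates go straight to FIPS
    if has_address:
        return "valid_address"    # geocode first, then FIPS
    return "invalid_lat_lon_address"  # nothing usable


print(classify_record({"street": "1250 W 16th St", "city": "Jacksonville",
                       "state": "FL", "zip": "32209"}))
# valid_address
```

Checking coordinates first means a record that has both coordinates and an address is treated as `valid_lat_long`, which matches the table's description of `valid_address/` as "no coordinates".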
| Subfolder | Contents |
|---|---|
| address/address_with_coordinates.zip | Addresses converted to coordinates |
| address/address_with_fips.zip | Addresses converted to FIPS codes |
| latlong/latlong_with_fips.zip | Existing coordinates converted to FIPS |
| invalid/ | Unusable records (typically empty) |
After understanding your outputs:
- Data cleaning: Handle failed or imprecise geocodes
- GIS linkage: Proceed to GIS Linkage to link with SDoH data