@mark-idleman mark-idleman commented Jul 11, 2025

Background

To improve the capacities used for traffic assignment, we decided to look upstream and improve the lanes column of the street export table, which is currently used as a fallback for calculating capacities when volumes aren't available. The goal is to limit cases where the value for lanes is -1 (our current placeholder for "we don't know the lane count") or 0, i.e., to ensure that lanes is always positive for links that allow cars.

Changes

First, I refactored the existing lane count logic into its own function, calculateDirectedLaneCounts, to make the street exporter code more readable. All the lane calculation logic now lives in a new class, StreetEdgeExporterHelper.

Next, I added a new function, guessLaneCountFromTags, that estimates the "overall" (i.e., both directions) lane count when the standard lanes tag isn't present, using other OSM data such as the road width, speed limit, and highway type of the link.

Finally, I refactored the overall lane logic to work as follows:

  • First, we parse the 3 possible OSM lane tags directly: lanes, lanes:forward, and lanes:backward
  • If both lanes:forward and lanes:backward are present, we output those values (almost) directly
  • If lanes:backward is missing but lanes:forward is present, we check if the overall lanes tag is present. If it is, we set backward lane count to overall - forward; if not, we assume lanes:forward are the only lanes present and set backward lane count to zero (i.e., the road is one-way)
    • identical (but reversed) logic applies when lanes:forward is missing but lanes:backward is present
  • If both lanes:forward and lanes:backward are missing, we need to do some additional inferring:
    • first, we check if the overall lanes tag is present. If it's not, we set overall lane count using the new guessLaneCountFromTags function
    • then, we decide how the overall lane count should be split into forward and backward lane counts, based on whether the way has tags indicating that it's a one-way road.
      • Note: much of this logic is the same as the existing oneway-handling logic, but there are slight differences. Mainly, guessLaneCountFromTags returns an overall lane count estimate, so if we use that inferred overall lane count to estimate lanes for a one-way road, we divide it by 2 (giving us an estimate for "lanes in one direction")
  • Finally, no matter what logic from above we used to determine forward/backward lanes to this point, we do some final modifications to enforce consistency with the lanes column and the mode-based accessibility of each link as denoted by the flags column:
    • Any link where ALLOWS_CAR isn't present gets a lane count of 0 - ie, the link is impassable to cars
    • Any link where ALLOWS_CAR is present must have a minimum lane count of 1 - ie, the link must have at least one lane passable to cars
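
The rules above can be read as the following rough Java sketch. It is not the code from this PR: the class name, the LaneCounts record, the per-direction allowsCar booleans, and the guessedOverall parameter (standing in for the result of guessLaneCountFromTags) are all assumptions made for illustration. Giving the forward direction the extra lane when an odd total is split, and using integer division when halving a guessed total, are also assumptions.

```java
import java.util.Map;

/** Hypothetical sketch of the directed lane-count rules; not the code from this PR. */
final class DirectedLaneSketch {
    static final int UNKNOWN = -1; // placeholder for "we don't know the lane count"

    /** Forward/backward lane counts for one OSM way (illustrative type). */
    record LaneCounts(int forward, int backward) {}

    /** Reads an integer tag; returns UNKNOWN if the tag is missing or not a non-negative integer. */
    static int parseTag(Map<String, String> tags, String key) {
        String raw = tags.get(key);
        if (raw == null) {
            return UNKNOWN;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value >= 0 ? value : UNKNOWN;
        } catch (NumberFormatException e) {
            return UNKNOWN;
        }
    }

    static LaneCounts calculate(Map<String, String> tags, boolean oneWay,
                                boolean forwardAllowsCar, boolean backwardAllowsCar,
                                int guessedOverall) {
        int overall = parseTag(tags, "lanes");
        int forward = parseTag(tags, "lanes:forward");
        int backward = parseTag(tags, "lanes:backward");

        if (forward == UNKNOWN && backward == UNKNOWN) {
            // Neither directional tag: fall back to the overall count, tagged or guessed.
            boolean guessed = overall == UNKNOWN;
            int total = guessed ? guessedOverall : overall;
            if (oneWay) {
                // A guessed total covers both directions, so halve it for a one-way road.
                forward = guessed ? total / 2 : total;
                backward = 0;
            } else {
                // Assumption: forward gets the extra lane when the total is odd.
                forward = (total + 1) / 2;
                backward = total / 2;
            }
        } else if (backward == UNKNOWN) {
            // Only lanes:forward is tagged.
            backward = overall != UNKNOWN ? Math.max(0, overall - forward) : 0;
        } else if (forward == UNKNOWN) {
            // Only lanes:backward is tagged.
            forward = overall != UNKNOWN ? Math.max(0, overall - backward) : 0;
        }
        // If both directional tags are present, they are used unchanged.

        // Final consistency pass against car accessibility (ALLOWS_CAR).
        forward = forwardAllowsCar ? Math.max(1, forward) : 0;
        backward = backwardAllowsCar ? Math.max(1, backward) : 0;
        return new LaneCounts(forward, backward);
    }
}
```

For example, under this sketch a two-way way tagged only with lanes=3 and lanes:forward=2 comes out as forward = 2 and backward = 1.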

Testing

With these changes, here's the new distribution of lanes in the nationwide street export file:

lanes,occurrences
1,136545425
2,3651620
3,999244
4,280105
5,65812
6,16842
7,2029
8,337
9,100
10,72
11,101
12,56
13,37
14,18
15,2
16,4
17,4
18,4
19,1
20,2
21,6
27,1
29,3
31,3
32,1

Testing the old (current master) street export table's lanes column and the resulting capacities, we see that many network links that allow cars have lanes <= 0, and many of the resulting capacities are <= 0:

SELECT COUNT(*)
FROM `model-159019.street_export.street_export_usa_2024_Q4_6ff053b4236375f4dd65cd7aae72627f9b69e52b` # old
WHERE lanes <= 0
AND flags LIKE '%ALLOWS_CAR%' # returns 122667587

SELECT COUNT(*)
  FROM mobility.capacities_table_2024_Q4_all_links
  JOIN `model-159019.street_export.street_export_usa_2024_Q4_6ff053b4236375f4dd65cd7aae72627f9b69e52b` s # old
  ON stable_edge_id = stableEdgeId
  WHERE capacity <= 0
  AND flags LIKE '%ALLOWS_CAR%' # returns 147974

Testing the new street export table's lanes column and the resulting capacities (capacities_table_2024_Q4_all_links_updated_lanes_column_v2), built with the changes from this PR, we see that no network links that allow cars have lanes <= 0, and no capacities are <= 0:

SELECT COUNT(*)
FROM `model-159019.street_export.street_export_usa_2024_Q4_8d1bdbe281b22eb1e9601a00a5160638b7ca1fa2` # new
WHERE lanes <= 0
AND flags LIKE '%ALLOWS_CAR%' # returns 0

SELECT COUNT(*)
  FROM mobility.capacities_table_2024_Q4_all_links_updated_lanes_column_v2
  JOIN `model-159019.street_export.street_export_usa_2024_Q4_8d1bdbe281b22eb1e9601a00a5160638b7ca1fa2` s # new
  ON stable_edge_id = stableEdgeId
  WHERE capacity <= 0 
  AND flags LIKE '%ALLOWS_CAR%' # returns 0

Spot-checking in Dallas, here are the spots where capacity is currently zero:
Screenshot 2025-07-11 at 2 14 45 PM

And here's the change in capacities between the old table (using street export off of master) and the new table (built using this PR), for the same spot in Dallas:
Screenshot 2025-07-11 at 2 15 02 PM

This shows that we've "filled in" capacities for all of the links where they'd previously been missing. I did similar checks, with similar results, in other areas of the USA as well.

-> Foursquare map with new vs. old capacities for NYC, Dallas, and Columbus, OH


Finally, I added a check to the street exporter unit test to ensure that lanes is a positive value for all car-accessible roads in the micro nor cal test region we use in this repo
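
A rough sketch of what such a check could look like is below. The ExportRow record and the helper name are hypothetical stand-ins, not the types used in this repo's unit test; the substring match on flags mirrors the flags LIKE '%ALLOWS_CAR%' filter used in the queries above.

```java
import java.util.List;

/** Hypothetical sketch of the "lanes must be positive for car links" check. */
final class LanesColumnCheck {
    /** Minimal stand-in for one row of the street export (illustrative type). */
    record ExportRow(String stableEdgeId, int lanes, String flags) {}

    /** Returns the rows that allow cars but do not have a positive lane count. */
    static List<ExportRow> carLinksWithoutLanes(List<ExportRow> rows) {
        return rows.stream()
                .filter(row -> row.flags().contains("ALLOWS_CAR"))
                .filter(row -> row.lanes() <= 0)
                .toList();
    }
}
```

A unit test would then assert that carLinksWithoutLanes returns an empty list for the exported test region.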

@mark-idleman mark-idleman changed the title from "Test lane column improvement" to "Improve lane column in street export CSV using OSM metadata" Jul 14, 2025
@mark-idleman mark-idleman marked this pull request as ready for review July 14, 2025 19:47
@mark-idleman
Author

Two remaining items that we still might consider before I merge here:

  1. Is it worth adding a ceiling to the possible values of the lanes column to filter out really high values, i.e., cap everything at, say, 8 lanes max? Occurrences of lanes being more than ~10 are rare, but they still might mess things up if included.
  2. Once we've settled on the logic in this PR, is it still worth running a MNC places run to check that the changes in capacities won't mess anything up in traffic assignment/places output quality?

@mark-idleman mark-idleman changed the title from "Improve lane column in street export CSV using OSM metadata" to "Improve lane column logic in street exporter using additional OSM metadata" Jul 14, 2025

rregue commented Jul 14, 2025

Will let @sudatta-mohanty have a look at the code in case his Java is better than mine, but to answer your questions above:

  • ceiling on lanes. Maybe; my only thought is that those are tolls, or locations that have a toll booth and therefore many lanes. Maybe we can eyeball a few and see if it makes sense.
  • on running MNC. Yes, but we need to do it regardless, so if we have QA'd this change and the capacities built from this change, I think we should be good to merge and then deal with MNC or places quality.

@sudatta-mohanty sudatta-mohanty left a comment

Looks great! Excited to see the impact on capacities and in turn on places.


// Width override if available; maxWidth will be infinity when the OSM width tag isn't present
if (maxWidth > 0 && maxWidth < Double.POSITIVE_INFINITY) {
    int estimatedLanes = (int) Math.round(maxWidth / 3.5); // assume average lane width of ~3.5m

Do we have documentation regarding the 3.5m average? From what I remember reading, average lane width on highways is around that value, but on residential roads, it may be closer to ~3-3.1m. Regardless, don't think it would make a lot of difference.

Author

Screenshot 2025-07-15 at 12 52 33 PM

This is what I used 😆 I rounded down a bit in an attempt to better account for smaller roads. But yes, it's very inexact, and we'd need more complicated logic that combined road type + width to better "guess" average lane width.

I'm happy to add this if we think it's important, but my other assumption here is that most roads tagged with maxWidth in OSM are going to be larger, freeway-like road sections, not smaller residential roads. So the higher lane width assumption probably holds pretty well in most cases
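
If we did want to combine road type and width, a minimal sketch might look like the following. Everything here is an assumption for illustration: the class and method names are hypothetical, the 3.0 m figure for smaller road types is only the rough value mentioned in the review comment above (not a sourced standard), and returning -1 for an unusable width simply reuses the existing "unknown" placeholder.

```java
import java.util.Map;

/** Hypothetical sketch: pick an assumed lane width from the highway type before dividing. */
final class LaneWidthSketch {
    static final int UNKNOWN = -1;                  // existing "we don't know" placeholder
    static final double DEFAULT_LANE_WIDTH_M = 3.5; // value used in the snippet under review

    // Illustrative narrower widths for smaller road types; these are assumptions, not a standard.
    static final Map<String, Double> LANE_WIDTH_BY_HIGHWAY_M = Map.of(
            "residential", 3.0,
            "living_street", 3.0,
            "service", 3.0);

    static int estimateLanes(double widthMeters, String highwayType) {
        // Rejects NaN, zero, negative, and infinite widths (infinity means "no width tag").
        if (!(widthMeters > 0) || Double.isInfinite(widthMeters)) {
            return UNKNOWN;
        }
        // Map.of maps reject null keys, so handle a missing highway type explicitly.
        double laneWidth = highwayType == null
                ? DEFAULT_LANE_WIDTH_M
                : LANE_WIDTH_BY_HIGHWAY_M.getOrDefault(highwayType, DEFAULT_LANE_WIDTH_M);
        return Math.max(1, (int) Math.round(widthMeters / laneWidth));
    }
}
```

With these assumed widths, an 11 m wide road would be estimated at 4 lanes if tagged residential and 3 lanes if tagged motorway, which shows how much the result depends on the lane width chosen.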


// "Guesses" overall (undirected) lane count for OSM Way based on other OSM tags, in cases where the `lanes` tag isn't present.
// Note: this function does not take into account whether or not a road is one-way; logic in calculateDirectedLaneCounts handles
// cases where an inferred ("guessed") overall lane count is applied to one-way roads.

If there is some documentation that guides this guess, it would be good to provide a link to it here.
Otherwise, just a comment saying something like:
"if the road is adjacent to a roundabout, then...., if width is provided, then...., if highway type is provided, then..."
would be good for future code readers.

Author

No formal documentation here, just some research into which tags are available in OSM and asking ChatGPT for some ideas. As far as I could find, there's no common "formula" people use to deduce lane counts (although if one exists I'd love to use it!)

For now, I added a more informative doc comment to this function to describe the steps better

@mark-idleman
Author

Thanks to you both for taking a look here!

@rregue :

  • I checked the links with very high lane counts using the following query in bigquerygeoviz:
    SELECT geometry, stableEdgeId, lanes
    FROM `model-159019.street_export.street_export_usa_2024_Q4_8d1bdbe281b22eb1e9601a00a5160638b7ca1fa2`
    WHERE flags LIKE '%ALLOWS_CAR%'
    AND lanes >= 10

Looking around spots like NYC, it does look like most of these links are toll road entrances, so we probably wouldn't want to cap them artificially. Other spots in the western US look like they might just be OSM errors (I saw some of those last week while testing). Maybe I'll skip applying a hard ceiling for now and see how things work out; it's probably better to err on the side of allowing more lanes (to cover toll booths properly).

  • Sounds good about MNC testing, I'll not worry about it for now and we'll make sure to sanity check things during the MNC run we'll do for places prep

@mark-idleman mark-idleman merged commit d080c1e into original-direction Jul 15, 2025
1 check passed
@mark-idleman mark-idleman deleted the test_lane_column_improvement branch July 15, 2025 22:21