Description
One of the popular problems in machine learning is dogs vs cats: given a picture, predict whether it shows a dog or a cat. Coming from this initial experience with machine learning, I kept thinking that classifying changesets as good or problematic is a similar problem. But today I did an exercise where I tried to identify one attribute of a changeset that makes it good or problematic. I started with:
- https://osmcha.mapbox.com/49563062/
`highway=residential` is modified to `highway=unclassified`
The following questions came to mind
- What could be the source of knowledge to modify?
- Isn't `residential` better than `unclassified`? I mean, something is better than nothing, right?
- At version `15`, this is quite a mature feature. So, is that alright?
- What is the length of the highway? Should shorter ones be residential and longer ones unclassified?
- Why is `source=google maps`? Really?
From https://wiki.openstreetmap.org/wiki/Key:highway
- highway=unclassified
The least most important through roads in a country's system – i.e. minor roads of a lower classification than tertiary, but which serve a purpose other than access to properties. Often link villages and hamlets.
- highway=residential
Roads which serve as an access to housing, without function of connecting settlements.
From https://osmlab.github.io/osm-deep-history/#/way/103217436
- The feature has mostly been `highway=unclassified` since creation in 2011.
Looking deeper into other changesets where a highway=residential gets modified into highway=unclassified, I found a user, Порфирий, who has lots of changesets with the same behavior. Interestingly, the user who added highway=residential is also Порфирий.
Eureka!
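The search described above (find every `residential` to `unclassified` change and see who made it) could be sketched roughly as follows. This is only an illustration: the shape of the version records (`user`, `tags`), the function names, and the sample history are all made up here, not the output format of osm-deep-history or OSMCha.

```python
from collections import Counter


def tag_transitions(versions, key="highway"):
    """Yield (user, old_value, new_value) each time `key` changes
    between consecutive versions of one feature."""
    for prev, curr in zip(versions, versions[1:]):
        old, new = prev["tags"].get(key), curr["tags"].get(key)
        if old != new:
            # The change is attributed to the user who saved the newer version.
            yield curr["user"], old, new


def count_transitions_by_user(histories, old, new, key="highway"):
    """Count, per user, how often `key` went from `old` to `new`."""
    counts = Counter()
    for versions in histories:
        for user, before, after in tag_transitions(versions, key):
            if (before, after) == (old, new):
                counts[user] += 1
    return counts


# Hypothetical history of a single way, oldest version first.
history = [
    {"version": 1, "user": "user_a", "tags": {"highway": "unclassified"}},
    {"version": 2, "user": "user_a", "tags": {"highway": "residential"}},
    {"version": 3, "user": "user_a", "tags": {"highway": "unclassified"}},
]

counts = count_transitions_by_user([history], "residential", "unclassified")
```

A user who both introduces a value and later reverts it, as in this made-up history, shows up in the transitions in both directions.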
When a highway modification has so many questions to answer and attributes to look at, what will the scale be when we look at all 26 primary tags together? What about features that don't have any primary tags? Too many questions! Too many attributes! Right?
- This does not look like a traditional cats vs dogs problem. It is a little something else.
- How about we try something different? How about we build one machine learning model for each object type?
- How would it look if there were a model trained on highways to classify whether a new or modified highway is a 👍 or a 👎?
- Another trained on buildings, another on water bodies, etc., each knowing what a good feature of its type looks like and what a problematic one looks like?
- Is this it?
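
One way to picture the "one model per object type" idea is as a dispatcher that looks at a feature's primary tag and hands it to the matching model. The sketch below is an assumption-laden illustration: `PRIMARY_KEYS` is a small hypothetical subset of the primary tags, and `highway_model` is a hand-written placeholder rule standing in for a trained classifier, not a real one.

```python
# Hypothetical subset of primary keys; the full list is longer.
PRIMARY_KEYS = ("highway", "building", "natural", "waterway")


def object_type(tags):
    """Return the first primary key present in `tags`, or None."""
    for key in PRIMARY_KEYS:
        if key in tags:
            return key
    return None


def highway_model(old_tags, new_tags):
    # Placeholder rule standing in for a trained highway model.
    if new_tags.get("source") == "google maps":
        return "problematic"
    return "good"


def default_model(old_tags, new_tags):
    # Fallback for features without a dedicated model or primary tag.
    return "good"


# Registry of per-object-type models; only highways are filled in here.
MODELS = {"highway": highway_model}


def classify(old_tags, new_tags):
    """Route a modified feature to the model for its object type."""
    kind = object_type(new_tags) or object_type(old_tags)
    model = MODELS.get(kind, default_model)
    return kind, model(old_tags, new_tags)


result = classify(
    {"highway": "residential"},
    {"highway": "unclassified", "source": "google maps"},
)
```

The fallback branch is where features without any primary tag would land, which is one of the open questions raised above.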


