Prototyping Gabbar for highway features #69

@bkowshik

One of the popular problems in machine learning is dogs vs cats: given a picture, predict whether it shows a dog or a cat. Coming from this initial experience with machine learning, I kept thinking that classifying changesets as good or problematic was a similar problem. But today I did an exercise where I tried to identify one attribute of a changeset that makes it good or problematic. I started with:

[Screenshot: 2017-06-16, 9:15:25 am]

The following questions came to mind:

  • What could be the source of knowledge behind the modification?
  • Isn't residential better than unclassified? I mean, something is better than nothing, right?
  • At version 15, this is quite a mature feature. So, is that alright?
  • What is the length of the highway? Should shorter ones be residential and longer ones unclassified?
  • Why is source=google maps? Really?
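The questions above can be read as candidate attributes of a highway modification. Here is a minimal sketch of collecting them, assuming each version of the way is available as a plain dict with `tags`, `version` and `coords` (a list of `(lon, lat)` pairs). That dict layout and the function names are hypothetical, chosen only for illustration; they are not an existing Gabbar API.

```python
import math

EARTH_RADIUS_M = 6371000  # mean Earth radius, good enough for a rough length


def haversine_m(a, b):
    """Great-circle distance in metres between two (lon, lat) points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def highway_attributes(old, new):
    """Collect the attributes raised in the questions above.

    `old` and `new` are hypothetical dicts for the previous and the
    modified version of the same way.
    """
    coords = new["coords"]
    # Sum the length of every segment between consecutive nodes.
    length_m = sum(haversine_m(p, q) for p, q in zip(coords, coords[1:]))
    return {
        "old_highway": old["tags"].get("highway"),
        "new_highway": new["tags"].get("highway"),
        "version": new["version"],
        "length_m": length_m,
        "source": new["tags"].get("source"),
    }
```

Whether any of these attributes actually separates good modifications from problematic ones is exactly the open question; the sketch only makes them explicit.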

From https://wiki.openstreetmap.org/wiki/Key:highway

  • highway=unclassified

The least important through roads in a country's system – i.e. minor roads of a lower classification than tertiary, but which serve a purpose other than access to properties. Often link villages and hamlets.

  • highway=residential

Roads which serve as an access to housing, without function of connecting settlements.

From https://osmlab.github.io/osm-deep-history/#/way/103217436

  • The feature has mostly been highway=unclassified since creation in 2011.

[Screenshot: 2017-06-16, 9:19:59 am]

Looking deeper into other changesets where a highway=residential gets modified into highway=unclassified, I find this user, Порфирий, who has lots of changesets with the same behavior. Interestingly, the user who added highway=residential is Порфирий too.

[Screenshot: 2017-06-16, 9:30:27 am]
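The "same behavior" check can be approximated by counting, per user, how often a given tag transition occurs. This is a rough sketch that assumes modifications are already flattened into dicts with `user`, `old_highway` and `new_highway` keys. The record layout, the function name and the sample users are made up for illustration and are not real data.

```python
from collections import Counter


def users_by_transition(modifications, old_value, new_value):
    """Rank users by how often they changed highway=old_value to new_value."""
    counts = Counter()
    for mod in modifications:
        if mod["old_highway"] == old_value and mod["new_highway"] == new_value:
            counts[mod["user"]] += 1
    # most_common() returns (user, count) pairs, highest count first.
    return counts.most_common()


# Illustrative records only.
mods = [
    {"user": "mapper_a", "old_highway": "residential", "new_highway": "unclassified"},
    {"user": "mapper_a", "old_highway": "residential", "new_highway": "unclassified"},
    {"user": "mapper_b", "old_highway": "residential", "new_highway": "unclassified"},
    {"user": "mapper_b", "old_highway": "unclassified", "new_highway": "residential"},
]
ranking = users_by_transition(mods, "residential", "unclassified")
# ranking == [("mapper_a", 2), ("mapper_b", 1)]
```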

Eureka!

When a highway modification has so many questions to answer and attributes to look at, what will the scale be when we look at all 26 primary tags together? What about features that don't have any primary tags? Too many questions! Too many attributes! Right?

  • This does not look like a traditional cats vs dogs problem. It is a little something else.
  • How about we try something different? How about we build one machine learning model for each object type?
  • How would it look if there were a model trained on highways to classify whether a new/modified highway is a 👍 or a 👎?
  • Another trained on buildings, another on water bodies, etc., with each one knowing what a good feature of its type looks like and what a problematic one looks like?
  • Is this it?
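To make the one-model-per-object-type idea concrete, here is a minimal sketch of the routing step: pick a feature's primary tag, then hand the feature to the model registered for that tag. Everything here is an assumption for illustration. `PRIMARY_KEYS` is a small subset rather than the full list of 26, the class and function names are hypothetical, and the "model" is a placeholder rule standing in for a trained classifier.

```python
# Illustrative subset only, not the full list of primary tags.
PRIMARY_KEYS = ("highway", "building", "natural")


def primary_key(tags):
    """Return the first primary key present in the tags, or None."""
    for key in PRIMARY_KEYS:
        if key in tags:
            return key
    return None


class PerTypeClassifier:
    """Routes each feature to the model registered for its primary tag."""

    def __init__(self):
        self.models = {}

    def register(self, key, model):
        # `model` is any callable taking a feature dict and returning a bool.
        self.models[key] = model

    def predict(self, feature):
        model = self.models.get(primary_key(feature["tags"]))
        if model is None:
            # No primary tag, or no model for this type yet: no verdict.
            return None
        return model(feature)


def toy_highway_model(feature):
    """Placeholder rule, NOT a trained model: flag a Google-derived source."""
    return "google" not in feature["tags"].get("source", "").lower()


router = PerTypeClassifier()
router.register("highway", toy_highway_model)
```

Returning `None` for features without a primary tag keeps that open question visible instead of hiding it behind a default verdict.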

cc: @anandthakker @geohacker @batpad
