Default namespaces make rules unintuitive and complex #8

@ghost

Description

Just got an email from a user trying to parse an Atom feed (http://feeds.feedburner.com/blogspot/hsDu?format=xml) and not getting any results for entry titles with the rule "/feed/entry/title".

Upon further investigation, it looks like that feed declares "http://www.w3.org/2005/Atom" as its default namespace (the xmlns attribute on the root element), which means the rule actually has to become:
/[http://www.w3.org/2005/Atom]feed/[http://www.w3.org/2005/Atom]entry/[http://www.w3.org/2005/Atom]title

because every unprefixed element in the feed is technically defined in that namespace.
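To illustrate why the unqualified rule matches nothing, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree` rather than the parser this issue is about. The inline feed is a made-up stand-in for the real one, and ElementTree writes qualified names as `{uri}name` instead of the `[uri]name` form used in the rule above.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

# Hypothetical miniature feed; only the default namespace declaration matters here.
FEED = f'<feed xmlns="{ATOM}"><entry><title>Hello</title></entry></feed>'

root = ET.fromstring(FEED)

# The root element's name carries the default namespace.
root_tag = root.tag

# An unqualified path finds nothing, because no element is named plain "entry".
plain = root.findall("entry/title")

# The namespace-qualified path finds the title.
qualified = root.findall(f"{{{ATOM}}}entry/{{{ATOM}}}title")

print(root_tag)
print(len(plain), len(qualified))
```

The same effect is what makes "/feed/entry/title" come back empty for the user's feed.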

It would be nice to have a way, either through automatic detection inside the parser or through an explicit property, to tell the parser to ignore the default namespace and treat it as no namespace. When the parser stores the current path, it would then leave off the namespace qualifier whenever the namespace equals the default, and rules like "/feed/entry/title" would work without issue.
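A rough sketch of that idea, written against Python's `xml.etree.ElementTree.iterparse` and not against the actual parser: `collect_text`, `step_for`, and the `ignore_default_ns` flag are hypothetical names invented for illustration. It assumes the default namespace is the first `xmlns="..."` declaration encountered (the root's, in the Atom case) and does not handle nested redeclarations.

```python
import io
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
FEED = f'<feed xmlns="{ATOM}"><entry><title>Hello</title></entry></feed>'


def step_for(tag, default_ns, ignore_default_ns):
    """Turn an ElementTree tag ('{uri}name' or 'name') into one path step."""
    if not tag.startswith("{"):
        return tag
    uri, local = tag[1:].split("}", 1)
    if ignore_default_ns and uri == default_ns:
        return local  # drop the qualifier for the default namespace
    return f"[{uri}]{local}"


def collect_text(xml_text, rule, ignore_default_ns=False):
    """Return the text of every element whose full path equals `rule`."""
    default_ns = None
    path = []
    matches = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start", "end")):
        if event == "start-ns":
            prefix, uri = payload
            if not prefix and default_ns is None:
                default_ns = uri  # first default namespace declaration wins
        elif event == "start":
            path.append(step_for(payload.tag, default_ns, ignore_default_ns))
        else:  # "end": the element's text is complete at this point
            if "/" + "/".join(path) == rule:
                matches.append(payload.text or "")
            path.pop()
    return matches


simple_rule = "/feed/entry/title"
strict_rule = f"/[{ATOM}]feed/[{ATOM}]entry/[{ATOM}]title"

print(collect_text(FEED, simple_rule))                          # flag off
print(collect_text(FEED, simple_rule, ignore_default_ns=True))  # flag on
print(collect_text(FEED, strict_rule))                          # fully qualified rule
```

In this sketch the flag is off by default, so fully qualified rules keep matching unless the caller opts in. With the flag on, the fully qualified rule no longer matches.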

It might be worth turning this property on by default.

The only issue I can think of is that all existing, literally correct rules written like the one above would suddenly start failing, which would probably drive their authors nuts, since their pedantic rule definitions are technically the more correct ones.

Need to think about this more.
