Hi,
I'd appreciate some clarification on a confusing area of the CSVW specification. I should state that the specification itself is very clear on the matter; what is unclear to me is the rationale for the decision and the expectations around it, and it is the rationale that I'd mainly like clarified.
I've been digging through the CSVW specs and working group discussions (in GitHub issues, IRC logs and minutes) trying to uncover the rationale myself, and I can't find any mention of it, so I thought I'd raise it here in order to understand CSVW's design more clearly.
The area of clarification concerns the choice of base URL used for resolution of URI templates; in particular, that the base is the table's URL and not the metadata document's URL.
The relevant section of the spec is here and is specifically this sentence describing the process for generating the annotation value:
3: resolving the resulting URL against the base URL of the table url if not null.
I would have expected that the base URL for resolving these templates after prefix expansion would have been either the declared @base in the metadata document, or the metadata document's location. The decision here is highly confusing, and counterintuitive given that nothing else assumes the table url:
- all other properties are resolved in terms of the metadata document's base.
- prefixes are expanded as if they followed the rules of the CSVW @context under a JSON-LD interpretation (I'm being overly careful with my words here because it's not strictly JSON-LD)
- language tags are resolved in terms of the context
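To make the difference concrete, here is a minimal sketch of the two candidate resolutions using Python's standard `urllib.parse.urljoin`. The two URLs and the expanded value `person/42` are made up for illustration; they don't come from the spec or from any real dataset.

```python
from urllib.parse import urljoin

# Hypothetical locations, chosen only for illustration.
METADATA_URL = "http://example.org/metadata/people.csv-metadata.json"
TABLE_URL = "http://example.org/data/people.csv"

# A relative URI template value after expansion, e.g. "person/{id}" with id=42.
expanded = "person/42"

# What the spec sentence quoted above describes: resolve against the table url.
against_table = urljoin(TABLE_URL, expanded)

# What I would have expected: resolve against the metadata document's location.
against_metadata = urljoin(METADATA_URL, expanded)

print(against_table)     # http://example.org/data/person/42
print(against_metadata)  # http://example.org/metadata/person/42
```

The two results only coincide when the CSV file and the metadata document sit in the same directory, which is presumably why the difference is easy to miss in small test cases.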
I've trawled the issues, minutes and commits for clues as to why this was chosen as the specified behaviour, and I can't see any explicit mention of it. There is, however, a lot of discussion in which the working group largely seem opposed to these semantics, and I can find no openly discussed justification for them, e.g.
@gkellogg: I believe we said elsewhere that information defined in a metadata file would be localized to that file. Certainly, a title needs to take the @language from the file in which it is located, doing the same with base URL with, or without @base being explicitly defined, is consistent with that. The main thing is that we be consistent, since anything else will only lead to more problems.
link
@iherman: In my view @base should follow the priority that we have established for metadata. This means that @base, say, defined in the user metadata should override the metadata files. That makes things way simpler and it is simply a matter of properly defining things.
link
Yes, I think we should be sure that template values are expanded first. I think it needs to be done when transforming into a Template used by the template processor, and before joining to the metadata base URL. I'll take the action of adding such text.
link
@JeniT: I've clarified that the base URL for the metadata document in the absence of the @base property in the @context is the location of the metadata document itself.
It's good practice for URLs to be taken as relative to the location where they're found. For example, the URLs of any imported metadata documents should be relative to the original metadata document or it will be incredibly confusing.
It might be that there are particular properties that should be interpreted as URLs relative to the @id of a table description, but I think we need to define them explicitly as doing so. 6a6d74, which properties do you think they are?
link
@JeniT: I think in many cases the propertyUrl will want to be resolved relative to the location of the document that it's located in (which might be a separate schema file that is shared across multiple files).
I suggest that to handle the requirement for URI templates to sometimes be URLs that are relative to the processed CSV file, we have another special variable, eg _tableUrl, which is the URL of the table (from the url property as it is now). Then you can have things like {_tableUrl}#row={_row} if you want to generate URLs like that to identify rows.
link
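As a rough sketch of how that suggestion would read in practice: the snippet below expands the `{_tableUrl}#row={_row}` template with a deliberately naive substitution. `_tableUrl` is the variable proposed in the comment above, not one I can point to in the published spec; `expand_simple` and the example URL are hypothetical. It is not an RFC 6570 implementation: a conforming processor would percent-encode reserved characters such as `:` and `/` in a plain `{var}` expression, so a real template would presumably need the reserved form `{+_tableUrl}`.

```python
import re

def expand_simple(template: str, variables: dict) -> str:
    """Substitute plain {name} expressions verbatim (no RFC 6570 encoding)."""
    def substitute(match: re.Match) -> str:
        return str(variables[match.group(1)])
    return re.sub(r"\{(\w+)\}", substitute, template)

# Hypothetical table location, chosen only for illustration.
TABLE_URL = "http://example.org/data/people.csv"
template = "{_tableUrl}#row={_row}"

row_ids = [
    expand_simple(template, {"_tableUrl": TABLE_URL, "_row": n})
    for n in (1, 2)
]
print(row_ids)
```

With an explicit variable like this, the template itself says which URL it is relative to, and everything else could be resolved against the metadata document.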
Given so many comments supporting the view that the metadata document should be the base for URI resolution, what changed? Is the spec actually correct in this regard? I suspect it is, given @JeniT's comments here, the presence of supporting notes in the spec and some affirmative changes in the git history.
I've spent a long time investigating this issue and wondering whether it was a bug in our implementation of the csv2rdf spec, an error in the specification itself, or a misunderstanding on my part. I feel I must be missing something important: a rationale to justify the semantics. Any clarification you can provide would be appreciated. My best guess is that it's because a representation-centric design of annotations was chosen rather than a model-centric one -- however that's probably best left for another discussion!
I appreciate all the work the working group has put into these specs, and am trying my hardest to make the most of it. So any advice on working around these issues in a way that is clearly supported by the spec would also be appreciated (hardcoding the full URIs everywhere is, as far as I'm concerned, a last resort). I was for example wondering if I could work around this by resolving these URIs relative to the @id on the table (which is, I think, one option Jeni suggests above); however, some parts of the spec appear ambiguous about whether that's how it should work.