-
Notifications
You must be signed in to change notification settings - Fork 328
Description
This is a discussion of API that overtakes exisitng PRs (#512 and #493) in order to plan and clarify API that needs to be implemented.
First of all I propose method name to be base_uri as it better reflects behaviour. I would agree on .origin proposed by @tarcieri, but I think it might be confusing due to URI API:
HTTP::URI.parse("https://example.com/foo/bar").origin
# => "https://example.com" Thus I believe the best name will be base_uri. Also, this method should raise exception if schema-less URI given:
HTTP.base_uri("//example.com/users")
# !! ArgumentError: Invalid base URI: //example.com/users
HTTP.base_uri("/users")
# !! ArgumentError: Invalid base URI: /usersAlso, it should allow chaining of base URIs:
HTTP.base_uri("https://example.com").base_uri("/users")
# => HTTP::Client with base URI="https://example.com/usersURI merging
IMO using Addressable's URI merging logic is straight forward:
uri = HTTP::URI.parse("https://example.com/a/b/c/")
uri.join("z") # => https://example.com/a/b/c/z
uri.join("/z") # => https://example.com/z
uri.join("./z") # => https://example.com/a/b/c/z
uri.join("../z") # => https://example.com/a/b/z
uri.join("//foobar.com/ah/a/") # => https://foobar.com/ah/a
uri.join("http://wut.wut/") # => http://wut.wutThere's one thing that should be handled separately, and we should provide shim for this:
HTTP::URI.parse("https://example.com/foo/bar").join("./baz")
# => https://example.com/foo/baz
HTTP::URI.parse("https://example.com/foo/bar/").join("./baz")
# => https://example.com/foo/bar/bazThe above snippet, IMO, should lead to the same result (foo/bar/baz) in both cases. So we should somewhat normalize base uri to have tailing / if there's none:
def merge_uris(base, appendix)
base = HTTP::URI.parse(base)
base = HTTP::URI.parse("#{base}/") unless base.to_s.end_with? "/"
base.join(appendix)
end
merge_uris("https://example.com/foo/bar", "baz") # => https://example.com/foo/bar/baz
merge_uris("https://example.com/foo/bar", "./baz") # => https://example.com/foo/bar/baz
merge_uris("https://example.com/foo/bar", "../baz") # => https://example.com/foo/baz
merge_uris("https://example.com/foo/bar", "/baz") # => https://example.com/baz
merge_uris("https://example.com/foo/bar/", "baz") # => https://example.com/foo/bar/baz
merge_uris("https://example.com/foo/bar/", "./baz") # => https://example.com/foo/bar/baz
merge_uris("https://example.com/foo/bar/", "../baz") # => https://example.com/foo/baz
merge_uris("https://example.com/foo/bar/", "/baz") # => https://example.com/bazSecurity consideration
I think it's at least worth pointing out (not as a blocker on this PR, just as a general note) that path joining is an area that's fraught with peril, especially if the base URI is assumed to be "secure" but the joined parameter is attacker-controlled and the attacker can exploit either relative path behaviors or completely overwrite an existing path with an absolute one.
-- #513 (comment)
I think if developer is not validating user input - we can't prevent him from stepping a land mine. Consider this dangerous code:
HTTP.base_uri("https://example.com/foo/bar").get(user_input)The above hides same danger as:
base_uri = HTTP::URI.parse("https://example.com/foo/bar')
HTTP.get(base_uri.join(user_input)And of course if one is just concatenating strings he's not doing it much better either:
HTTP.get("#{base_uri}#{user_input}")What we can do better is to allow "locking" base uri, so that it will raise exception on attempt to get unexpected URI:
client = HTTP.base_uri("https://example.com/foo/bar", :enforced => true)
client.get("baz")
# => GET https://example.com/foo/bar/baz
client.get("../baz")
# !! "https://example.com/foo/baz" is outside of base uriHow about persistent options
Something that was completely missed in opened PRs and discussions is how to handle when both persistent and base URI are given. I think this should be handled this way:
client = HTTP.base_uri("https://example.com/foo/bar")
client.persistent("https://gitlab.com")
# !! ArgumentError, persistence origin mismatch base URI
pclient = HTTP.persistent("https://gitlab.com")
pclient.base_uri("https://example.com/foo/bar")
# !! ArgumentError, base URI is outide of persistence originIn other words we should raise an error when there's conflict between the two. And in order to make life a bit easier we can allow:
pclient = HTTP.base_uri("https://example.com/foo/bar").persistentIn other words, if there's base URI, persistent can get its origin by default.