Return 404 instead of 400 responses for obviously-invalid URLs #1434

@robertknight

Requests for "obviously invalid" URLs like https://via.hypothes.is/wp-admin return 400 responses instead of 404. This is inconvenient because we cannot easily filter out such responses in, e.g., the New Relic metrics that monitor the overall error rate of the service.

We have seen bots hit a large number of URLs like this in a short window of time, typically looking for vulnerabilities in common PHP packages. This has triggered an alarm that fires when 80% or more of the service's requests fail over a period of time (10-15 minutes).

The reason for the 400 here is that /wp-admin matches the general route for proxying websites, which treats the part after the initial / as a URL with an optional protocol. CheckmateClient.check_url fails to parse wp-admin as a public URL and raises BadURL, which results in a 400 response.
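One possible approach is to run a cheap plausibility check on the path before it reaches the URL check, and return 404 when the path cannot be a public URL. The sketch below is illustrative only: looks_like_proxyable_url is a hypothetical helper, not existing Via code, and the heuristic (http/https scheme plus a hostname with at least two non-empty dot-separated labels) is an assumption about what counts as "obviously invalid".

```python
from urllib.parse import urlsplit


def looks_like_proxyable_url(path: str) -> bool:
    """Heuristically decide whether a request path could be a public URL.

    Hypothetical helper for illustration; not part of Via or Checkmate.
    """
    candidate = path.lstrip("/")
    if not candidate:
        return False

    # The protocol is optional in the proxy route, so assume http if absent.
    if "://" not in candidate:
        candidate = "http://" + candidate

    try:
        parts = urlsplit(candidate)
    except ValueError:
        # urlsplit raises ValueError for some malformed inputs,
        # such as an unterminated IPv6 literal.
        return False

    if parts.scheme not in ("http", "https"):
        return False

    # Require a dotted hostname such as "example.com". A bare word like
    # "wp-admin" has no dot, and ".env" has an empty first label.
    labels = (parts.hostname or "").split(".")
    return len(labels) >= 2 and all(labels)
```

A view or tween could return 404 when this check fails and fall through to the existing URL check otherwise. Note that paths such as /wp-login.php still pass this heuristic because they resemble a hostname, so it would only filter out part of the bot traffic.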

For context, see https://hypothes-is.slack.com/archives/C074BUPEG/p1728300410941439?thread_ts=1728292002.576029&cid=C074BUPEG.

New Relic alert: https://one.newrelic.com/alerts/issue?account=1385283&duration=259200000&state=e0b2c426-026d-27ee-4aa8-b0894fb965d1
