Possible miscategorizing of aws-hosted sites #50

@S4lt5

Description

💡 Summary

When running findcdn against a site like www.achp.gov, I get the following output:

❯ findcdn list www.achp.gov

{
   "date": "10/27/2022, 14:22:16",
   "cdn_count": "1",
   "domains": {
       "www.achp.gov": {
           "IP": "'3.32.142.183', '3.32.4.248'",
           "cdns": "'.amazonaws.com'",
           "cdns_by_names": "'Amazon AWS'"
       }
   }
}

I also added some debug prints, which showed the following values:

HEADERS:  ['Apache']
Whois:  ['AMAZON EXPANSION, IE', 'AMAZON-EC2-USGOVCLD', 'AMAZON EXPANSION, IE', 'AMAZON-EC2-USGOVCLD']

None of the static files I checked on the site show any sign of being served through a CDN (no x-cache-* headers, no via header, no .cloudfront URL, etc.).
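The kind of header check described above can be sketched roughly as follows. This is only an illustration: `cdn_header_hints` is a hypothetical helper, not part of findcdn, and the list of header names is my own assumption about what commonly indicates a CDN or caching proxy.

```python
from typing import List, Mapping

# Lowercase header prefixes/names that often suggest a CDN or caching proxy.
# Illustrative assumption only; this is not findcdn's own detection list.
CDN_HEADER_PREFIXES = ("x-cache", "x-amz-cf-")
CDN_HEADER_NAMES = ("via",)


def cdn_header_hints(headers: Mapping[str, str]) -> List[str]:
    """Return the 'name: value' pairs that look like CDN indicators."""
    hints = []
    for name, value in headers.items():
        lname = name.lower()
        if lname.startswith(CDN_HEADER_PREFIXES) or lname in CDN_HEADER_NAMES:
            # Header name itself is a typical CDN/cache marker.
            hints.append(f"{name}: {value}")
        elif ".cloudfront.net" in value.lower():
            # Any header value pointing at a CloudFront hostname.
            hints.append(f"{name}: {value}")
    return hints


# A response carrying only a Server header yields no hints.
print(cdn_header_hints({"Server": "Apache"}))  # []
```

An empty result does not prove there is no CDN, since headers can be stripped, but it matches what I saw for this site.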

Also, the whois data gives at least some hint that this is an EC2 instance serving static files, which probably does not fall into our "has a CDN" category.
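One possible way to act on that whois hint, sketched under assumptions: `looks_like_ec2` is a hypothetical helper (not an existing findcdn function), and treating the substring "EC2" as a signal for compute hosting rather than a CDN is my guess, not a confirmed rule.

```python
from typing import Iterable


def looks_like_ec2(whois_entries: Iterable[str]) -> bool:
    """Return True if any whois string mentions EC2 (case-insensitive).

    Assumption: an 'EC2' network name suggests a compute instance
    rather than a CDN edge. This is a heuristic, not a guarantee.
    """
    return any("EC2" in entry.upper() for entry in whois_entries)


# Whois values taken from the debug output in this issue.
whois = [
    "AMAZON EXPANSION, IE",
    "AMAZON-EC2-USGOVCLD",
    "AMAZON EXPANSION, IE",
    "AMAZON-EC2-USGOVCLD",
]
print(looks_like_ec2(whois))  # True
```

A check like this could be used to downgrade or flag an "Amazon AWS" match when nothing else points to a CDN, though whether that is the right classification is exactly the open question here.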

What do you think?

Motivation and context

Accuracy of the reported output is important to me, and I'm not sure whether the way such a site is classified should change.

Implementation notes

Unsure

Acceptance criteria

Unsure

Metadata

Labels: improvement (This issue or pull request will add new or improve existing functionality)
