lokalebasen/simply-robotstxt
Folders and files
| Name | Name | Last commit date | ||
|---|---|---|---|---|
Repository files navigation
robots.txt is an extremely simple format, but it still needs some parser loving. This class will normalise robots.txt entries for you. USAGE ----- With the following robots.txt: User-agent: * Disallow: /logs User-agent: Google Disallow: /admin Sitemap: http://something.com/sitemap.xml Use it like this: require 'robotstxtparser' # Also accepts a local file rp = RobotsTxtParser.new() rp.read("http://something.com/robots.txt") rp.user_agent('Google') # returns ["/logs", "/admin"] rp.user_agent('Autobot') # returns ["/logs"] rp.sitemaps # returns ["http://something.com/sitemap.xml"]