Design Overview Interface Reference
Interface that defines the crawler task as a whole. Construction requires an HttpClient, a DocumentFactory, and a UrlFilter, plus a URL that serves as the starting point of the crawl.
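As a rough illustration of how those four constructor inputs might fit together, here is a minimal sketch in Python. It is not the library's actual API: the `Crawler` class, its `crawl` method, the callable-based stand-ins for the client, factory, and filter, and the idea that a document exposes a `links` list are all assumptions made for the example. The site is an in-memory dictionary, so no network access is involved.

```python
from collections import deque
from typing import Callable, Dict, List

# Hypothetical stand-ins for the three collaborators (not the real interfaces).
Fetch = Callable[[str], str]                # HttpClient role: URL -> raw body
MakeDocument = Callable[[str, str], Dict]   # DocumentFactory role: (URL, body) -> document
AllowUrl = Callable[[str], bool]            # UrlFilter role: URL -> may it be crawled?


class Crawler:
    """Breadth-first crawl starting from a single URL (assumed strategy)."""

    def __init__(self, fetch: Fetch, make_document: MakeDocument,
                 allow_url: AllowUrl, start_url: str) -> None:
        self.fetch = fetch
        self.make_document = make_document
        self.allow_url = allow_url
        self.start_url = start_url

    def crawl(self) -> List[Dict]:
        queue = deque([self.start_url])
        seen = {self.start_url}
        documents = []
        while queue:
            url = queue.popleft()
            if not self.allow_url(url):
                continue  # out of scope: never fetched
            document = self.make_document(url, self.fetch(url))
            documents.append(document)
            for link in document["links"]:  # "links" is an assumed field
                if link not in seen:
                    seen.add(link)
                    queue.append(link)
        return documents


# A tiny fake site: each body is just a space-separated list of outgoing links.
PAGES = {
    "http://example.test/": "http://example.test/a http://other.test/x",
    "http://example.test/a": "http://example.test/",
}

crawler = Crawler(
    fetch=lambda url: PAGES.get(url, ""),
    make_document=lambda url, body: {"url": url, "links": body.split()},
    allow_url=lambda url: url.startswith("http://example.test/"),
    start_url="http://example.test/",
)
crawled_urls = [document["url"] for document in crawler.crawl()]
```

In this sketch the filter is consulted before each fetch, so the link to `other.test` is discovered but never requested. Whether the real crawler filters before fetching or before queueing is not stated in this reference.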
Interface that describes the mechanism the crawler uses to fetch the documents it needs.
Component that takes a response returned from an Http\Client and uses it to create and return a Document object.
Component that is registered with a DocumentFactory and is responsible for parsing and annotating the information in an Http\Client response for a given MIME type.
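The relationship between the factory and its per-MIME-type handlers can be sketched as a simple registry. The following Python example is illustrative only: `Response`, `Document`, `register_handler`, `create`, and the `annotations` dictionary are hypothetical names chosen for the sketch, and the behavior for an unregistered MIME type (returning an unannotated document) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Response:
    """Hypothetical stand-in for what the HTTP client returns."""
    url: str
    mime_type: str
    body: str


@dataclass
class Document:
    """Hypothetical stand-in for the document value object."""
    url: str
    annotations: Dict[str, object] = field(default_factory=dict)


# A handler parses the response and annotates the document in place.
Handler = Callable[[Response, Document], None]


class DocumentFactory:
    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register_handler(self, mime_type: str, handler: Handler) -> None:
        self._handlers[mime_type] = handler

    def create(self, response: Response) -> Document:
        document = Document(url=response.url)
        handler = self._handlers.get(response.mime_type)
        if handler is not None:  # assumed: unknown types stay unannotated
            handler(response, document)
        return document


def annotate_plain_text(response: Response, document: Document) -> None:
    # Example annotation: count whitespace-separated words in the body.
    document.annotations["word_count"] = len(response.body.split())


factory = DocumentFactory()
factory.register_handler("text/plain", annotate_plain_text)

text_doc = factory.create(
    Response("http://example.test/a.txt", "text/plain", "hello crawler world"))
binary_doc = factory.create(
    Response("http://example.test/a.bin", "application/octet-stream", ""))
```

Keying handlers by MIME type keeps the factory itself unaware of any particular format; supporting a new format in this sketch means registering one more handler.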
A simple value object, with accessor and mutator functions, that represents a resource.
A container that holds zero or more UrlFilter\Rule objects. The crawler uses it to limit the scope of the crawl.
A class that represents a condition under which a particular URL is allowed to be crawled.
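The filter and its rules can be pictured as a container of predicates. The Python sketch below is an illustration under stated assumptions, not the library's implementation: the names `Rule`, `UrlFilter`, `add_rule`, and `allows` are hypothetical, and so is the combination policy, in which a URL is allowed only if every rule allows it (so a filter with zero rules allows everything). The reference does not say how multiple rules are actually combined.

```python
from dataclasses import dataclass, field
from typing import Callable, List
from urllib.parse import urlparse


@dataclass(frozen=True)
class Rule:
    """One condition a URL must satisfy to be crawled (hypothetical shape)."""
    description: str
    condition: Callable[[str], bool]

    def allows(self, url: str) -> bool:
        return self.condition(url)


@dataclass
class UrlFilter:
    """Container for zero or more rules."""
    rules: List[Rule] = field(default_factory=list)

    def add_rule(self, rule: Rule) -> None:
        self.rules.append(rule)

    def allows(self, url: str) -> bool:
        # Assumed policy: every rule must allow the URL.
        # all() over an empty list is True, so an empty filter is permissive.
        return all(rule.allows(url) for rule in self.rules)


same_host = Rule(
    "stay on example.test",
    lambda url: urlparse(url).hostname == "example.test",
)
no_pdf = Rule(
    "skip PDF files",
    lambda url: not urlparse(url).path.endswith(".pdf"),
)

empty_filter = UrlFilter()
url_filter = UrlFilter()
url_filter.add_rule(same_host)
url_filter.add_rule(no_pdf)
```

With these two rules, `url_filter.allows("http://example.test/docs")` is true, while a URL on another host or one ending in `.pdf` is rejected.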
A class that, when bound to the results of a crawl, allows the creation of some sort of (hopefully) useful formatted report.