Web Scraper API

FastAPI service that fetches a URL and returns the page title and number of links. Built for reliability (timeout, redirects, optional SSL verification) and clear error messages.
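The README does not show how the title and link count are extracted, so the following is only a minimal sketch of one way to do it with the standard library's `html.parser`. The names `TitleAndLinkCounter` and `summarize_html` are hypothetical, and counting only `<a>` tags that carry an `href` attribute is an assumption about what "links" means here; the actual service may use a different parser or rule.

```python
from html.parser import HTMLParser


class TitleAndLinkCounter(HTMLParser):
    """Collects text inside <title> and counts <a> tags that have an href."""

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.links_found = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a" and any(name == "href" for name, _ in attrs):
            self.links_found += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so collect and join later.
        if self._in_title:
            self.title_parts.append(data)


def summarize_html(url, html):
    """Return a dict with the same keys as the documented response."""
    parser = TitleAndLinkCounter()
    parser.feed(html)
    parser.close()
    title = "".join(parser.title_parts).strip() or None
    return {"url": url, "title": title, "links_found": parser.links_found}
```

For example, `summarize_html("https://example.com", page_html)` returns a dictionary with the keys `url`, `title`, and `links_found`; a page without a `<title>` yields `None` for the title in this sketch.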

Features

  • GET /scrape: scrapes the URL passed in the url query parameter; returns url, title, links_found
  • Query param verify_ssl (default true): set to false to skip SSL certificate verification for HTTPS targets (e.g. on Windows or behind a proxy)
  • 15-second timeout, User-Agent header sent, redirects followed
  • Clear errors: timeout, SSL failure, invalid URL
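The fetch behaviour in the list above could look roughly like the sketch below. The README does not say which HTTP client the service uses, so this uses the standard library's `urllib` as a stand-in. The function names, the `USER_AGENT` value, and the error category strings are all hypothetical; only the 15-second timeout, the redirect behaviour, and the `verify_ssl` switch come from the feature list.

```python
import socket
import ssl
import urllib.error
import urllib.request

TIMEOUT_SECONDS = 15  # from the feature list
USER_AGENT = "web-scraper-api-sketch/0.1"  # hypothetical value


def fetch_html(url, verify_ssl=True):
    """Fetch a page; urllib follows redirects by default."""
    context = ssl.create_default_context()
    if not verify_ssl:
        # check_hostname must be disabled before verify_mode is relaxed.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(
        request, timeout=TIMEOUT_SECONDS, context=context
    ) as response:
        return response.read().decode("utf-8", errors="replace")


def classify_fetch_error(exc):
    """Map a fetch exception to a short category (category names are assumed)."""
    # URLError wraps the underlying cause in its .reason attribute.
    reason = getattr(exc, "reason", exc)
    if isinstance(reason, ssl.SSLError):
        return "ssl_failure"
    if isinstance(reason, (TimeoutError, socket.timeout)):
        return "timeout"
    if isinstance(exc, ValueError):
        # urllib raises ValueError for strings it cannot treat as a URL.
        return "invalid_url"
    return "fetch_failed"
```

An endpoint could wrap `fetch_html` in a `try`/`except` and turn the category from `classify_fetch_error` into a readable error message. The SSL check comes first because certificate verification errors also subclass `ValueError`.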

Setup

Optional: create and activate a virtual environment (Windows commands shown; on macOS or Linux, activate with source venv/bin/activate instead):

python -m venv venv
venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Run

uvicorn main:app --reload

API runs at http://127.0.0.1:8000.

Swagger (interactive docs)

http://127.0.0.1:8000/docs

Try the /scrape endpoint from the browser.

Example requests

HTTP (no SSL):

http://127.0.0.1:8000/scrape?url=http://info.cern.ch

HTTPS with SSL verification disabled (e.g. if you get certificate errors):

http://127.0.0.1:8000/scrape?url=https://example.com&verify_ssl=false

Response shape:

{
  "url": "https://example.com",
  "title": "Example Domain",
  "links_found": 1
}
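For calling the endpoint from a script rather than the browser, a small client sketch follows. It assumes the service is running locally on the default address shown above; the helper names `build_scrape_url` and `scrape` are hypothetical. The target URL is percent-encoded here so that any query string it carries is not mistaken for parameters of `/scrape`.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # default address from the Run section


def build_scrape_url(target_url, verify_ssl=True, base_url=BASE_URL):
    """Build the /scrape request URL with the target safely encoded."""
    params = {"url": target_url}
    if not verify_ssl:
        params["verify_ssl"] = "false"
    return f"{base_url}/scrape?{urllib.parse.urlencode(params)}"


def scrape(target_url, verify_ssl=True):
    """Call the running service and return the decoded JSON response."""
    with urllib.request.urlopen(build_scrape_url(target_url, verify_ssl)) as response:
        return json.load(response)
```

With the server started, `scrape("https://example.com", verify_ssl=False)` should return a dictionary in the response shape shown above.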