a-sehic-dev/web-scraper-api

Web Scraper API

A FastAPI service that fetches a URL and returns the page title and the number of links on the page. It is built for reliability (request timeout, redirect following, optional SSL verification) and returns clear error messages.

Features

  • GET /scrape: scrapes a given URL and returns url, title, and links_found
  • Query param verify_ssl (default true): set to false to skip SSL certificate verification for HTTPS (e.g. on Windows or behind a proxy)
  • 15-second timeout, User-Agent header set, redirects enabled
  • Clear errors: timeout, SSL failure, invalid URL
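
The fetch behaviour listed above (timeout, User-Agent, redirects, optional SSL verification, readable errors) can be sketched with the standard library alone. This is only an illustration, not the project's code: the HTTP client the service actually uses is not shown in this README, and the names fetch_html, build_ssl_context, and the USER_AGENT value are hypothetical.

```python
import socket
import ssl
import urllib.error
import urllib.request
from urllib.parse import urlparse

TIMEOUT_SECONDS = 15  # the timeout listed in Features
USER_AGENT = "web-scraper-api-sketch/0.1"  # hypothetical value, not from the project


def build_ssl_context(verify_ssl: bool = True) -> ssl.SSLContext:
    """Return a TLS context; certificate checks are disabled when verify_ssl is False."""
    context = ssl.create_default_context()
    if not verify_ssl:
        # Order matters: hostname checking must be turned off before CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


def fetch_html(url: str, verify_ssl: bool = True) -> str:
    """Fetch a page and return its decoded body, raising readable errors."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Invalid URL: {url!r}")

    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    try:
        # urlopen follows redirects by default.
        with urllib.request.urlopen(
            request, timeout=TIMEOUT_SECONDS, context=build_ssl_context(verify_ssl)
        ) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")
    except urllib.error.URLError as exc:
        if isinstance(exc.reason, ssl.SSLError):
            raise RuntimeError(
                "SSL verification failed; retry with verify_ssl=false"
            ) from exc
        if isinstance(exc.reason, (TimeoutError, socket.timeout)):
            raise RuntimeError(
                f"Request timed out after {TIMEOUT_SECONDS}s"
            ) from exc
        raise RuntimeError(f"Request failed: {exc.reason}") from exc
    except (TimeoutError, socket.timeout) as exc:
        # Timeouts while reading the body are not wrapped in URLError.
        raise RuntimeError(f"Request timed out after {TIMEOUT_SECONDS}s") from exc
```

Validating the URL before any network call keeps the "invalid URL" case separate from connection failures, so each error named in the list gets its own message.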

Setup

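Optional: create and activate a virtual environment (the activate command below is for Windows; on macOS/Linux use source venv/bin/activate instead):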

python -m venv venv
venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Run

uvicorn main:app --reload

The API runs at http://127.0.0.1:8000.

Swagger (interactive docs)

http://127.0.0.1:8000/docs

Try the /scrape endpoint from the browser.

Example requests

HTTP (no SSL):

http://127.0.0.1:8000/scrape?url=http://info.cern.ch

HTTPS with SSL verification disabled (e.g. if you get certificate errors):

http://127.0.0.1:8000/scrape?url=https://example.com&verify_ssl=false
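
The example URLs above work as typed because the target addresses contain no query strings of their own. When the target URL does include characters such as ? or &, it should be percent-encoded so it arrives as a single url parameter. A minimal sketch of building such a request URL follows; build_scrape_url is a hypothetical helper, and the base address assumes the local server from the Run section.

```python
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1:8000"  # local address from the Run section


def build_scrape_url(target_url: str, verify_ssl: bool = True) -> str:
    """Build a /scrape request URL with the target URL safely encoded."""
    params = {"url": target_url}
    if not verify_ssl:
        # Only send the flag when overriding the default (true).
        params["verify_ssl"] = "false"
    return f"{BASE_URL}/scrape?{urlencode(params)}"


print(build_scrape_url("https://example.com/search?q=a&b=1", verify_ssl=False))
# http://127.0.0.1:8000/scrape?url=https%3A%2F%2Fexample.com%2Fsearch%3Fq%3Da%26b%3D1&verify_ssl=false
```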

Response shape:

{
  "url": "https://example.com",
  "title": "Example Domain",
  "links_found": 1
}
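
As a rough illustration of how the three fields in this response could be derived from a fetched page, the sketch below uses the standard library's html.parser. It is an assumption, not the project's implementation: the service may use a different parser, and counting only a tags that carry an href attribute is a choice made here for the example. TitleAndLinkParser and summarize_page are hypothetical names.

```python
from html.parser import HTMLParser


class TitleAndLinkParser(HTMLParser):
    """Collect the <title> text and count <a href=...> tags."""

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.links_found = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a" and any(name == "href" for name, _ in attrs):
            # Assumption: only anchors with an href count as links.
            self.links_found += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text can arrive in several chunks, so collect and join later.
        if self._in_title:
            self.title_parts.append(data)


def summarize_page(url: str, html: str) -> dict:
    """Return a dict shaped like the response above."""
    parser = TitleAndLinkParser()
    parser.feed(html)
    parser.close()
    title = "".join(parser.title_parts).strip()
    return {"url": url, "title": title, "links_found": parser.links_found}


sample_html = (
    "<html><head><title>Example Domain</title></head>"
    '<body><p><a href="https://www.iana.org/domains/example">More</a></p></body></html>'
)
print(summarize_page("https://example.com", sample_html))
# {'url': 'https://example.com', 'title': 'Example Domain', 'links_found': 1}
```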

About

Python web scraping API built with FastAPI that extracts structured data from websites and returns clean JSON responses.
