plasmate-labs/quickstart-node

Plasmate Quickstart — Node.js

A minimal template showing how to use Plasmate from Node.js. Fetch web pages and get back a structured Semantic Object Model (SOM) instead of raw HTML.

Prerequisites

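The scripts need Node.js, and the Quick Start commands use the GitHub CLI (gh). Install Plasmate with Cargo, which requires a Rust toolchain: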

cargo install plasmate

What's Included

| Script | Description |
| --- | --- |
| fetch-page.mjs | Fetch a single URL and print the semantic content |
| batch-fetch.mjs | Fetch multiple URLs and save results as JSON |
| extract-structured-data.mjs | Extract headings, links, images, and text from a page |

Quick Start

# Clone this template
gh repo create my-scraper --template plasmate-labs/quickstart-node --clone
cd my-scraper

# Fetch a page
node fetch-page.mjs https://news.ycombinator.com

# Extract structured data
node extract-structured-data.mjs https://github.com/trending

# Batch fetch
node batch-fetch.mjs https://example.com https://example.org
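
The batch step can be approximated with a short loop. This is a sketch, not the contents of batch-fetch.mjs: it assumes `plasmate fetch <url>` prints the SOM as JSON on stdout, as the How It Works section shows. The names `fetchSom` and `fetchAll` are hypothetical.

```javascript
import { execFileSync } from "node:child_process";

// Run `plasmate fetch <url>` and parse its stdout as JSON.
// execFileSync passes the URL as one argument without going through a shell,
// so characters such as & or $ inside the URL are not interpreted.
function fetchSom(url) {
  const output = execFileSync("plasmate", ["fetch", url], { encoding: "utf-8" });
  return JSON.parse(output);
}

// Fetch every URL in order. A failure on one URL is recorded rather than
// thrown, so the remaining URLs are still processed.
// `fetchOne` is injectable so the loop can be exercised without Plasmate installed.
function fetchAll(urls, fetchOne = fetchSom) {
  return urls.map((url) => {
    try {
      return { url, ok: true, som: fetchOne(url) };
    } catch (err) {
      return { url, ok: false, error: String(err.message ?? err) };
    }
  });
}
```

To save the results, pass `fetchAll(process.argv.slice(2))` through `JSON.stringify` and write it with `writeFileSync` from `node:fs`; the output filename is up to you.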

How It Works

Plasmate fetches a web page and returns its Semantic Object Model (SOM): a structured JSON representation of the page content, organized into regions that each contain a list of elements.

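import { execSync } from "node:child_process";

// Run the Plasmate CLI and capture its stdout as a string.
const output = execSync('plasmate fetch "https://example.com"', { encoding: "utf-8" });
// The output is JSON, so parse it into a plain object.
const som = JSON.parse(output);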

// som = {
//   title: "Example Domain",
//   lang: "en",
//   regions: [
//     {
//       role: "main",
//       id: "content",
//       elements: [
//         { role: "heading", text: "Example Domain", level: 1 },
//         { role: "text", text: "This domain is for use in illustrative examples..." },
//         { role: "link", text: "More information...", href: "https://www.iana.org/domains/example" }
//       ]
//     }
//   ]
// }
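
Once parsed, the SOM is an ordinary object, so pulling out headings or links is a matter of walking `regions` and `elements`. The sketch below assumes the field names shown in the commented example (`regions`, `elements`, `role`, `text`, `href`); `elementsByRole` is a hypothetical helper, and extract-structured-data.mjs may do this differently.

```javascript
// Sample SOM copied from the commented example above; a real page will differ.
const som = {
  title: "Example Domain",
  lang: "en",
  regions: [
    {
      role: "main",
      id: "content",
      elements: [
        { role: "heading", text: "Example Domain", level: 1 },
        { role: "text", text: "This domain is for use in illustrative examples..." },
        { role: "link", text: "More information...", href: "https://www.iana.org/domains/example" },
      ],
    },
  ],
};

// Collect every element with the given role across all regions.
// Missing `regions` or `elements` arrays are treated as empty.
function elementsByRole(som, role) {
  return (som.regions ?? []).flatMap((region) =>
    (region.elements ?? []).filter((el) => el.role === role)
  );
}

const headings = elementsByRole(som, "heading").map((el) => el.text);
const links = elementsByRole(som, "link").map((el) => ({ text: el.text, href: el.href }));

console.log(headings);
console.log(links);
```

Filtering by `role` keeps the helper independent of which region an element sits in, which is convenient when a page has several regions.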

License

MIT
