
simple-offtopic

Detect off-topic user prompts using local sentence embeddings. No API keys, no network calls at inference time — just fast, private, on-device classification.

How it works

simple-offtopic embeds your topic description and example prompts into vectors using a local all-MiniLM-L6-v2 model (~90 MB, cached after first download). When you check a user prompt, it computes cosine similarity against those reference vectors and compares the result to your threshold.

                   ┌─────────────────────┐
   "How do I       │  Sentence Embedding │    cosine
    return this?" ─┤  (all-MiniLM-L6-v2) ├──► similarity ──► on-topic (0.82)
                   └─────────────────────┘    vs. refs
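The comparison step above is plain cosine similarity between embedding vectors. A minimal sketch (this `cosineSimilarity` helper is illustrative, not part of the package's public API):

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|).
// Identical directions score 1.0; orthogonal vectors score 0.0.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 0], [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```

The library takes the maximum similarity across all reference vectors (topic description plus examples), so a prompt only needs to be close to one of them to count as on-topic.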

Install

npm install simple-offtopic

@huggingface/transformers pulls in sharp (a native image-processing library) as a dependency. simple-offtopic only uses text embeddings, so sharp is never needed at runtime. If npm install fails with a sharp build error, skip its native build step:

npm install --ignore-scripts simple-offtopic

Quick start

import { SimpleOfftopic } from 'simple-offtopic';

const checker = new SimpleOfftopic({
  topic: "Customer support for an e-commerce platform selling electronics",
  threshold: 0.5,
  examples: [
    "How do I return a product?",
    "What's your warranty policy?",
    "My order hasn't arrived yet",
  ],
});

await checker.initialize(); // loads model + pre-computes embeddings

const result = await checker.check("What's the weather like today?");
console.log(result);
// { isOffTopic: true, confidence: 0.23 }

const result2 = await checker.check("Can I get a refund?");
console.log(result2);
// { isOffTopic: false, confidence: 0.81 }

API

new SimpleOfftopic(config)

Parameter   Type       Default    Description
topic       string     required   A description of the intended topic
threshold   number     0.5        Similarity score below which a prompt is considered off-topic (0–1)
examples    string[]   []         Example on-topic prompts to improve accuracy

checker.initialize(): Promise<void>

Loads the embedding model and pre-computes reference vectors for the topic and examples. Must be called before check().

The model is downloaded on first use (~90 MB) and cached locally by @huggingface/transformers.

checker.check(prompt): Promise<CheckResult>

interface CheckResult {
  isOffTopic: boolean;  // true if confidence < threshold
  confidence: number;   // 0–1, max cosine similarity to reference vectors
}
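A common use is gating prompts before they reach an LLM. A minimal sketch of such a guard, written against the `CheckResult` shape above (the `guardPrompt` function and its fallback message are hypothetical, not part of the API):

```typescript
interface CheckResult {
  isOffTopic: boolean; // true if confidence < threshold
  confidence: number;  // 0–1, max cosine similarity to reference vectors
}

// Hypothetical guard: return a canned reply for off-topic prompts,
// or null to signal that the prompt should proceed to the real handler.
function guardPrompt(result: CheckResult, fallback: string): string | null {
  return result.isOffTopic ? fallback : null;
}

const reply = guardPrompt(
  { isOffTopic: true, confidence: 0.23 },
  "Sorry, I can only help with questions about our store."
);
console.log(reply); // "Sorry, I can only help with questions about our store."
```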

Tips

  • More examples = better accuracy. Adding 3–5 representative on-topic prompts significantly improves classification.
  • Tune the threshold. Start at 0.5 and adjust based on your use case. Lower values are more permissive; higher values are stricter.
  • The topic description matters. A specific description like "Customer support for an e-commerce platform selling electronics" works better than "customer support".
  • First call is slower. The model download is ~90 MB. Subsequent calls use the cache and are fast.
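One way to tune the threshold is a simple sweep over a small labeled set of confidence scores you have already collected from check(). This is an illustrative sketch, not a feature of the package; `LabeledScore` and `bestThreshold` are hypothetical names:

```typescript
// A confidence score from check() paired with a human label.
interface LabeledScore {
  confidence: number;
  onTopic: boolean;
}

// Pick the candidate threshold with the highest accuracy, where a prompt
// counts as on-topic when confidence >= threshold (mirroring the library's
// rule that isOffTopic is true when confidence < threshold).
function bestThreshold(samples: LabeledScore[], candidates: number[]): number {
  let best = candidates[0];
  let bestAcc = -1;
  for (const t of candidates) {
    const correct = samples.filter(
      (s) => (s.confidence >= t) === s.onTopic
    ).length;
    const acc = correct / samples.length;
    if (acc > bestAcc) {
      bestAcc = acc;
      best = t;
    }
  }
  return best;
}

const samples: LabeledScore[] = [
  { confidence: 0.81, onTopic: true },
  { confidence: 0.74, onTopic: true },
  { confidence: 0.41, onTopic: false },
  { confidence: 0.23, onTopic: false },
];
console.log(bestThreshold(samples, [0.3, 0.5, 0.7])); // 0.5
```

Even a dozen labeled prompts per class is usually enough to see whether the default of 0.5 is too permissive or too strict for your topic.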

Requirements

  • Node.js 18+
  • ESM ("type": "module" in your package.json or .mjs files)

License

MIT
