Duration: 20 minutes
For: Developers new to LLMs or AI verification
Goal: Understand core concepts before diving into verification techniques
- 🤖 What is an LLM and how does it work?
- 💭 What are hallucinations and why do they happen?
- ⚖️ Probabilistic vs. Deterministic systems
- 🎯 Why verification is critical for production AI
An LLM is a text prediction machine trained on massive amounts of internet text.
How it works:
- You give it a prompt: "The capital of France is"
- It predicts the most likely next word: "Paris"
- It keeps predicting: "Paris, known for the Eiffel Tower..."
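The prediction loop above can be sketched in a few lines. The tiny `NEXT_WORD` table below is a made-up stand-in for a real model's learned probabilities, not an actual LLM:

```python
# Toy autoregressive generation: repeatedly append the most likely next word.
# NEXT_WORD is a hypothetical stand-in for a trained model's predictions.
NEXT_WORD = {
    "The capital of France is": "Paris,",
    "The capital of France is Paris,": "known",
    "The capital of France is Paris, known": "for",
    "The capital of France is Paris, known for": "the",
    "The capital of France is Paris, known for the": "Eiffel",
    "The capital of France is Paris, known for the Eiffel": "Tower.",
}

def generate(prompt: str, max_words: int = 10) -> str:
    text = prompt
    for _ in range(max_words):
        word = NEXT_WORD.get(text)
        if word is None:          # nothing more to predict
            break
        text = f"{text} {word}"   # append the prediction, then predict again
    return text

print(generate("The capital of France is"))
# → "The capital of France is Paris, known for the Eiffel Tower."
```

Real models predict over tens of thousands of tokens at every step, but the loop is the same: predict, append, repeat.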
Key insight: It doesn't "know" facts. It predicts patterns.
Prompt: "2 + 2 ="
LLM predicts: "4" ✅ (saw this pattern a million times)
Prompt: "2843 + 7291 ="
LLM predicts: "10134" ✅ or "9134" ❌ (less common pattern, so it might guess wrong)
✅ Pattern Recognition
- Writing emails
- Translating languages
- Summarizing text
- Code generation (common patterns)
✅ Creative Tasks
- Brainstorming ideas
- Writing stories
- Conversational responses
❌ Precise Calculations
- Math (they predict digits, not calculate)
- Logic (they pattern-match, not reason)
- Code execution (they generate code, can't run it)
❌ Factual Accuracy
- Might mix up dates, numbers, names
- Can't distinguish truth from plausible-sounding fiction
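These weaknesses are exactly what deterministic code is good at. A minimal sketch, using the earlier addition: Python computes the digits that an LLM only predicts, so a wrong prediction is caught immediately:

```python
# An LLM predicts digits; Python computes them.
llm_answer = "9134"             # a plausible but wrong LLM prediction
true_answer = str(2843 + 7291)  # computed, not predicted

print(true_answer)               # → "10134"
print(llm_answer == true_answer) # → False: the wrong prediction is caught
```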
Hallucination: When an LLM generates plausible-sounding but incorrect information.
LLMs are trained to predict plausible text, not true text.
Example:
User: "Who was the first person on Mars?"
LLM: "Neil Armstrong in 1969."
✅ Plausible (Armstrong was the first person on the Moon)
❌ Wrong (no one has been to Mars yet)
Factual Errors
- Wrong dates, numbers, names
- "iPhone 15 was released in 2019" ❌ (it launched in 2023)
Logic Errors
- "If A > B and B > C, then C > A" ❌ (the valid conclusion is A > C)
Calculation Errors
- "15% of $200 is $35" ❌ (should be $30)
Invented References
- "According to study XYZ-2023..." (study doesn't exist)
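Several of these error types are mechanically checkable. A sketch of the calculation and logic cases from the list above:

```python
# Calculation error: "15% of $200 is $35"
claimed = 35
actual = 0.15 * 200
print(actual)             # → 30.0
print(claimed == actual)  # → False: the claim fails

# Logic error: "If A > B and B > C, then C > A"
# Test the claim against concrete values that satisfy the premises.
A, B, C = 3, 2, 1         # A > B and B > C both hold
print(C > A)              # → False: the stated conclusion fails
print(A > C)              # → True: the valid conclusion
```

Invented references are harder: they need a lookup against a real database, not arithmetic. That's why citation checking is its own category of verification.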
Healthcare: Wrong dosage (could be fatal)
Finance: Incorrect interest calculation ($12,889 error in production)
Legal: Fake case citations (lawyer sanctioned)
E-commerce: Hallucinated discounts (revenue loss)
How it works: Predicts based on patterns and probabilities
```
# LLM generates different answers each time
llm("What is 2+2?")
→ "4" (90% probability)
→ "2+2 equals four" (8%)
→ "The answer is 4" (2%)
```
Characteristics:
- ✅ Flexible, creative
- ✅ Handles ambiguity well
- ❌ Not 100% reliable
- ❌ Different outputs for same input
Use cases: Creative writing, summarization, conversation
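The behavior above can be simulated with stdlib sampling. The three completions and their probabilities come from the example; the `llm` function here is a toy stand-in, not a real model:

```python
import random

# Toy stand-in for an LLM's output distribution for one prompt.
COMPLETIONS = ["4", "2+2 equals four", "The answer is 4"]
WEIGHTS = [0.90, 0.08, 0.02]

def llm(prompt: str) -> str:
    # Sampling is what makes output probabilistic: same input, varying output.
    return random.choices(COMPLETIONS, weights=WEIGHTS, k=1)[0]

# Over many calls with the identical prompt, all three variants show up.
outputs = {llm("What is 2+2?") for _ in range(1000)}
print(outputs)
```

All three answers happen to be correct here, but the same mechanism produces wrong answers when a wrong completion has nonzero probability.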
How it works: Follows exact rules, always same output
```
# Calculator always gives same answer
calculator(2 + 2)
→ 4 (100% certainty, always)
```
Characteristics:
- ✅ 100% reliable (for verifiable tasks)
- ✅ Same input = same output
- ❌ Can't handle ambiguity
- ❌ Needs precise specifications
Examples: SymPy (math), Z3 (logic), compilers (code)
Best practice: Use LLM for generation, deterministic tools for verification
User Query (English)
↓
LLM translates to code
↓
Deterministic engine verifies
↓
Return verified result
This is the neurosymbolic approach QWED uses!
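A minimal sketch of that pipeline, with a hypothetical `llm_translate` stub standing in for the model (QWED's real implementation will differ):

```python
import ast
import operator

# Hypothetical stub: in the real pipeline, an LLM maps English to an expression.
def llm_translate(query: str) -> str:
    return {"What is 15% of 200?": "0.15 * 200"}[query]

# Deterministic engine: safely evaluate arithmetic via the AST, no eval().
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def verify(expr: str) -> float:
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported expression: {expr!r}")
    return walk(ast.parse(expr, mode="eval").body)

result = verify(llm_translate("What is 15% of 200?"))
print(result)  # → 30.0
```

The division of labor is the point: the LLM handles the ambiguous English, and the deterministic evaluator produces a result you can trust.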
Without Verification:
- Hope LLM is right (73-85% accuracy on finance tasks)
- Manually check outputs (slow, error-prone)
- Ship bugs to production (costly)
With Verification:
- Mathematically prove correctness
- Block errors before they reach users
- Ship with confidence
| Industry | Cost of Error | Example |
|---|---|---|
| Healthcare | Lives | Wrong dosage (1000x overdose) |
| Finance | Revenue | $12,889 calculation error |
| Legal | Sanctions | Fake case citations |
| E-commerce | Trust | Hallucinated discounts |
✅ Must verify:
- Financial calculations
- Medical dosages
- Legal citations
- Security checks
- Regulatory compliance
⏸️ Optional verification:
- Creative writing
- Casual conversation
- Brainstorming
- Subjective opinions
Answer these to test yourself:
- What is an LLM?
  Answer: A text prediction machine that generates likely next words based on training patterns, not facts.
- Why do hallucinations happen?
  Answer: LLMs predict plausible text, not true text. They can't distinguish fact from fiction.
- What is the difference between probabilistic and deterministic systems?
  Answer: Probabilistic (LLM): flexible but unreliable, with different outputs for the same input. Deterministic (calculator): exactly the same output for the same input, 100% reliable.
- When should you use verification?
  Answer: When errors have real consequences: money, lives, legal issues, security.
✅ LLMs predict patterns, not facts
✅ Hallucinations are inevitable (they're not bugs; they're a consequence of how LLMs work)
✅ Probabilistic ≠ Deterministic
✅ Verification prevents costly errors
Next: Now that you understand the problem, learn how QWED solves it!