SecureLLMChatbot-GuardingWithLLMGuard

Author: Yusuf Adamu
Project: Prompt Injection Detection in TinyLLaMA Chatbots using LLM Guard
Environment: Google Colab | Python | Transformers | Hugging Face | Scikit-learn


🚀 Overview

This project demonstrates the integration of LLM Guard with the TinyLLaMA-1.1B-Chat model to detect and defend against prompt injection attacks. The system combines input and output scanning to ensure safe and reliable chatbot responses, evaluated using adversarial and safe prompt datasets.
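A minimal sketch of this scan → respond-or-reject flow is shown below. The llm_guard and transformers calls follow those libraries' public APIs, but the model id, scanner thresholds, and banned topics are illustrative assumptions rather than the exact notebook configuration.

```python
# Sketch of the guarded pipeline: scan the prompt, then either reject it
# with its risk scores or pass the sanitized prompt to TinyLLaMA.
from llm_guard import scan_prompt
from llm_guard.input_scanners import BanTopics, PromptInjection
from transformers import pipeline

# Assumed model id, thresholds, and topics -- adjust to match the notebook.
chat = pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
input_scanners = [
    PromptInjection(threshold=0.5),
    BanTopics(topics=["violence"], threshold=0.5),
]

def guarded_reply(user_prompt: str, max_new_tokens: int = 128) -> str:
    # scan_prompt returns the sanitized prompt plus per-scanner validity and risk scores
    sanitized, valid, scores = scan_prompt(input_scanners, user_prompt)
    if not all(valid.values()):
        return f"Prompt blocked (risk scores: {scores})"
    # Build the chat-formatted prompt and generate a response
    messages = [{"role": "user", "content": sanitized}]
    text = chat.tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    out = chat(text, max_new_tokens=max_new_tokens, do_sample=False)
    return out[0]["generated_text"][len(text):].strip()
```

Output scanning works the same way: the generated reply would be passed through llm_guard.scan_output with output scanners before being returned to the user.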


🔒 Key Features

  • ✅ Input sanitization with PromptInjection and BanTopics scanners
  • ✅ Unsafe prompt blocking with risk score explanation
  • ✅ End-to-end pipeline: scan → sanitize → respond or reject
  • ✅ Evaluation on the Safe-Guard Prompt Injection Dataset
  • ✅ Metrics: Accuracy, F1, ROC-AUC, Confusion Matrix (see the evaluation sketch after this list)
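The evaluation metrics listed above can be computed with scikit-learn by scoring each labeled prompt with the PromptInjection scanner. The snippet below is a hedged sketch: the label encoding (1 = injection, 0 = safe), the threshold, and the assumption that the risk score lies in [0, 1] reflect common conventions, not the notebook's exact setup.

```python
# Evaluation sketch: treat the PromptInjection risk score as the positive-class
# score and compare the scanner's verdict against the dataset's labels.
from llm_guard.input_scanners import PromptInjection
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)

scanner = PromptInjection(threshold=0.5)  # threshold assumed

def evaluate(prompts, labels):
    """prompts: list of strings; labels: 1 = injection, 0 = safe (assumed encoding)."""
    scores, preds = [], []
    for p in prompts:
        # scan() returns (sanitized_prompt, is_valid, risk_score)
        _, is_valid, risk = scanner.scan(p)
        scores.append(risk)                # risk score assumed to be in [0, 1]
        preds.append(0 if is_valid else 1) # invalid prompt => flagged as injection
    print("Accuracy :", accuracy_score(labels, preds))
    print("Precision:", precision_score(labels, preds))
    print("Recall   :", recall_score(labels, preds))
    print("F1 score :", f1_score(labels, preds))
    print("ROC AUC  :", roc_auc_score(labels, scores))
    print("Confusion matrix:\n", confusion_matrix(labels, preds))
```

With the Safe-Guard Prompt Injection dataset loaded via Hugging Face datasets, this would be called as, e.g., evaluate(split["text"], split["label"]); the column names are assumptions about the dataset schema.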

📊 Results Summary

Metric      Value
Accuracy    94.6%
Precision   99.8%
Recall      82.8%
F1 Score    90.5%
ROC AUC     ~0.95

πŸ“ Project Structure

  • LLMGAURD_Project.ipynb – Main implementation in Colab
  • LLMGAURD_Project.py – Full implementation exported from Google Colab
  • README.md – Project overview (this file)

📬 Contact

For questions or collaboration: yusufadamu.research@gmail.com
