Scenario 2: Career Gap Intelligence Platform
Ashlesha T | ashleshat5@gmail.com
[ Demo Video: https://youtu.be/qfjP9bxnLAo ]
SkillBridge AI ingests resumes and job descriptions, performs semantic skill-gap analysis via a dual-stream neural network, and delivers personalized 4-week learning roadmaps — all on an async, Kafka-driven pipeline.
- Dual-Stream MLP — tabular features (KNN distances, role, seniority) + raw 384-dim skill embedding fused for priority scoring
- Async Pipeline — FastAPI → Kafka → Workers → ChromaDB → WebSocket push
- Idempotency + DLQ — Redis-backed deduplication; failed events persisted to PostgreSQL dead-letter queue with exponential backoff
- Groq LLM Roadmaps — Llama 3.1 with 8 s timeout and template fallback
- Observability — Prometheus metrics exposed at /metrics; Grafana dashboards
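The idempotency layer reduces to a single atomic Redis SET NX call. A minimal sketch of the idea (function and key names here are hypothetical; the real logic lives in the worker code):

```python
def is_duplicate(client, event_id: str, ttl_seconds: int = 3600) -> bool:
    """Return True if this Kafka event was already processed.

    redis-py's SET with nx=True writes only when the key is absent and
    returns None otherwise, so a redelivered event is detected atomically.
    """
    return client.set(f"dedup:{event_id}", "1", nx=True, ex=ttl_seconds) is None

# In a worker: client = redis.Redis.from_url(REDIS_URL)
# if is_duplicate(client, event["id"]): skip processing
```

The TTL keeps the dedup keyspace bounded; a redelivery after expiry would re-run the analysis, which is acceptable because the downstream write is itself idempotent.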
| Layer | Technology |
|---|---|
| API | FastAPI (async) |
| Message Queue | Kafka (Confluent Platform 7.6, KRaft — no ZooKeeper) |
| Cache / Idempotency | Redis 7 |
| Database | PostgreSQL 15 + SQLAlchemy async |
| Vector Store | ChromaDB |
| Embeddings | all-MiniLM-L6-v2 (384-dim, pre-trained) |
| Gap Priority Model | PyTorch DualStreamMLP + sklearn KNN |
| LLM | Groq — Llama 3.1 8b |
| Observability | Prometheus + Grafana |
Tabular features (18-dim): 8 skill-category one-hot + 4 role one-hot + 3 seniority one-hot + gap_score + mean_knn_dist + min_knn_dist
Embedding (384-dim): raw all-MiniLM-L6-v2 output of the JD skill name
Training: FocalLoss(γ=2.0), Adam(lr=1e-3, wd=1e-4), CosineAnnealingWarmRestarts(T₀=30), best-checkpoint saving, seed=42
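For reference, the focal-loss term above can be sketched in PyTorch as follows. This is a minimal illustration with γ=2.0; the notebook's actual class may differ in head structure and reduction:

```python
import torch
import torch.nn as nn

class FocalLoss(nn.Module):
    """Focal loss (Lin et al., 2017): down-weights easy examples so
    training focuses on hard, misclassified skill gaps."""
    def __init__(self, gamma: float = 2.0):
        super().__init__()
        self.gamma = gamma

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # Per-sample cross-entropy, then scale by (1 - p_true)^gamma.
        ce = nn.functional.cross_entropy(logits, targets, reduction="none")
        pt = torch.exp(-ce)  # probability assigned to the true class
        return ((1 - pt) ** self.gamma * ce).mean()

# Optimizer/scheduler matching the settings above:
# opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
# sched = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(opt, T_0=30)
```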
Metrics (val set, 205 samples):
| Metric | Value |
|---|---|
| F1 | 0.85 |
| Accuracy | 0.91 |
| R² | 0.72 |
| MAE | 0.09 |
- Docker & Docker Compose
- Groq API key — get one free at console.groq.com
Create .env in the project root:
GROQ_API_KEY=gsk_your_key_here
POSTGRES_DB=skillbridge
POSTGRES_USER=skill
POSTGRES_PASSWORD=password
DATABASE_URL=postgresql+asyncpg://skill:password@postgres:5432/skillbridge
REDIS_URL=redis://redis:6379
KAFKA_BOOTSTRAP_SERVERS=kafka:9092
CHROMA_PERSIST_DIR=/app/chroma

Then build and start the stack:

docker compose up -d --build

This starts 9 services: PostgreSQL, Redis, Kafka (KRaft), API, 4 workers, Prometheus, Grafana.
Wait ~30 s for Kafka to be healthy, then check:
docker compose ps # all services should show "healthy" or "running"
curl localhost:8000/health

Frontend:
cd frontend
npm install
npm run dev

API Endpoints:

| Endpoint | Description |
|---|---|
| POST /api/resume/text | Submit resume as JSON |
| POST /api/resume/upload | Upload PDF resume |
| POST /api/jd/text | Submit JD as JSON |
| POST /api/jd/url | Submit JD URL (scraper) |
| GET /api/roadmap/{rid}/{jid} | Generate 4-week roadmap |
| POST /api/chat/{session_id} | Career chat |
| WS /ws/{session_id} | Real-time analysis push |
| GET /metrics | Prometheus metrics |
| GET /health | Health check |
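As a quick smoke test, the first endpoint can be exercised from Python with only the standard library. The `text` field name is an assumption; the authoritative schema is browsable at http://localhost:8000/docs once the API is up:

```python
import json
from urllib import request

API = "http://localhost:8000"  # assumes the compose stack is running

def build_resume_request(text: str) -> request.Request:
    # Build a POST to /api/resume/text; the JSON field name `text`
    # is an assumption -- check the live FastAPI docs for the real schema.
    body = json.dumps({"text": text}).encode()
    return request.Request(
        f"{API}/api/resume/text",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# with request.urlopen(build_resume_request("Python, Kafka, FastAPI"), timeout=10) as r:
#     print(json.loads(r.read()))
```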
Install test dependencies (one-time, uses project venv):
.venv/bin/pip install pytest pytest-asyncio

Run all tests:

.venv/bin/python -m pytest backend/tests/ -v

Expected output: 16 passed
backend/tests/test_edge_cases.py::test_scraper_malformed_url_returns_none PASSED
backend/tests/test_edge_cases.py::test_scraper_connection_error_returns_none PASSED
backend/tests/test_edge_cases.py::test_scraper_http_error_returns_none PASSED
backend/tests/test_edge_cases.py::test_groq_timeout_returns_fallback_roadmap PASSED
backend/tests/test_edge_cases.py::test_groq_api_error_returns_fallback PASSED
backend/tests/test_edge_cases.py::test_chat_response_groq_timeout_returns_string PASSED
backend/tests/test_edge_cases.py::test_fallback_engine_no_templates_returns_generic PASSED
backend/tests/test_edge_cases.py::test_fallback_roadmap_has_4_weeks PASSED
backend/tests/test_gap_analysis.py::test_result_has_all_required_fields PASSED
backend/tests/test_gap_analysis.py::test_skill_gaps_sorted_by_priority_descending PASSED
backend/tests/test_gap_analysis.py::test_top_gap_has_highest_priority PASSED
backend/tests/test_gap_analysis.py::test_overall_match_bounded PASSED
backend/tests/test_gap_analysis.py::test_resume_not_found_raises_value_error PASSED
backend/tests/test_gap_analysis.py::test_jd_not_found_raises_value_error PASSED
backend/tests/test_gap_analysis.py::test_empty_jd_skills_returns_zero_match PASSED
backend/tests/test_gap_analysis.py::test_empty_resume_all_skills_flagged_missing PASSED
test_gap_analysis.py — Happy Path (4)

| Test | What it verifies |
|---|---|
| test_result_has_all_required_fields | Response contains resume_id, jd_id, overall_match, skill_gaps, missing_skills |
| test_skill_gaps_sorted_by_priority_descending | skill_gaps list is ordered highest → lowest priority |
| test_top_gap_has_highest_priority | skill_gaps[0] is always the most critical skill to learn |
| test_overall_match_bounded | overall_match is always in [0.0, 1.0] |
test_gap_analysis.py — Negative (4)

| Test | What it verifies |
|---|---|
| test_resume_not_found_raises_value_error | Missing resume ID → ValueError (not 500) |
| test_jd_not_found_raises_value_error | Missing JD ID → ValueError (not 500) |
| test_empty_jd_skills_returns_zero_match | JD with no skills → overall_match=0.0, skill_gaps=[] |
| test_empty_resume_all_skills_flagged_missing | Resume with no skills → all JD skills in missing_skills |
test_edge_cases.py — Scraper (3)

| Test | What it verifies |
|---|---|
| test_scraper_malformed_url_returns_none | Unreachable URL → returns None, no crash |
| test_scraper_connection_error_returns_none | Network-level failure → graceful None |
| test_scraper_http_error_returns_none | HTTP 404 → graceful None (route falls back to paste mode) |
test_edge_cases.py — LLM / Reliability (5)

| Test | What it verifies |
|---|---|
| test_groq_timeout_returns_fallback_roadmap | Groq timeout → mode="fallback" roadmap returned |
| test_groq_api_error_returns_fallback | Any Groq exception → fallback + llm_fallback_total incremented |
| test_chat_response_groq_timeout_returns_string | Chat timeout → non-empty string, no exception propagated |
| test_fallback_engine_no_templates_returns_generic | Missing templates file → generic 4-week roadmap |
| test_fallback_roadmap_has_4_weeks | Every fallback roadmap has exactly week1–week4 |
All tests use mocks — no running Kafka, Redis, PostgreSQL, or Groq API required.
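The mocking pattern is the standard one: patch the external client so it raises, then assert the fallback path runs. A simplified, self-contained version of the timeout test (names here are hypothetical; the real fixtures live in backend/tests/):

```python
import asyncio
from unittest.mock import AsyncMock

async def generate_roadmap(llm_call, fallback):
    # Same shape as the service: 8 s LLM budget, template fallback on failure.
    try:
        return await asyncio.wait_for(llm_call(), timeout=8.0)
    except Exception:
        return fallback()

def test_timeout_returns_fallback():
    # The "Groq client" is an AsyncMock that simulates a timeout.
    llm = AsyncMock(side_effect=asyncio.TimeoutError)
    result = asyncio.run(generate_roadmap(llm, lambda: {"mode": "fallback"}))
    assert result == {"mode": "fallback"}
```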
The priority model is trained via a Jupyter notebook (no Docker needed):
cd backend/ml
jupyter lab train_gap_model.ipynb # or open in VS Code
# Run all cells — takes ~2 min on CPU
# Saves: ml_artifacts/gap_model.pt + ml_artifacts/knn_model.pkl

Raw metrics at http://localhost:8000/metrics and http://localhost:9090.
| Metric | Type | Description |
|---|---|---|
| events_processed_total | Counter | Kafka events processed per topic |
| analysis_completed_total | Counter | Gap analyses completed |
| gap_analysis_latency_seconds | Histogram | End-to-end analysis latency |
| llm_fallback_total | Counter | Groq failures → fallback activations |
| dlq_depth | Gauge | Current dead-letter queue backlog |
Prometheus datasource is auto-provisioned — no manual config needed.
- Open http://localhost:3001 — login admin / admin
- Dashboards → New → Add visualization and use these queries:
| Panel | PromQL |
|---|---|
| Events/sec | rate(events_processed_total[1m]) |
| Analysis rate | rate(analysis_completed_total[1m]) |
| Latency p95 | histogram_quantile(0.95, gap_analysis_latency_seconds_bucket) |
| LLM fallbacks | increase(llm_fallback_total[5m]) |
| DLQ depth | dlq_depth |
While a simple background task would work for a single-user demo, Kafka was chosen to satisfy the event-driven requirements of the case study. It provides:
- Resilience: The DLQ (Dead Letter Queue) ensures no analysis is lost if the LLM or ML model crashes.
- Async UX: The UI remains responsive while the "heavy" embedding and ML steps happen in the background.
- Extensibility: New features (like a "Notification Service" or "Audit Log") can be added as new Kafka consumers without changing the core API logic.
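The extensibility point can be illustrated abstractly: a new feature is just another subscriber on an existing topic, and the producing API never changes. Below is a pure-Python stand-in for Kafka consumer groups (all topic and handler names are hypothetical):

```python
from collections import defaultdict

HANDLERS = defaultdict(list)  # topic -> [handler, ...]

def consumer(topic: str):
    """Register a handler for a topic -- a stand-in for a consumer group."""
    def register(fn):
        HANDLERS[topic].append(fn)
        return fn
    return register

@consumer("analysis.completed")
def notify_user(event: dict) -> None:
    ...  # e.g. push a WebSocket notification

@consumer("analysis.completed")  # added later, with no API change
def audit_log(event: dict) -> None:
    ...  # e.g. append to an audit table

def dispatch(topic: str, event: dict) -> None:
    # In Kafka, the broker does this fan-out; each group gets the event.
    for handler in HANDLERS[topic]:
        handler(event)
```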
No. The AI Resume Enhancer uses a "Semantic Reframing" prompt. It analyzes your existing bullet points and rewrites them to better align with the JD's terminology and priorities without inventing fake experience.
This project was built with AI assistance:
- Gemini CLI — synthetic dataset generation (data/mlp_training_data.json, data/sample_jobs.json, etc.)
- Claude (Anthropic) — architecture design, ML model implementation, test suite, code review
One suggestion I evaluated and rejected: using soft-KNN weighted averaging as the embedding signal instead of the raw 384-dim embedding stream. Soft-KNN would require storing all training embeddings at inference time (memory overhead) and is strictly less expressive than giving the model direct access to the full semantic space.
● What did you cut to stay within the 4–6 hour limit?
- User Authentication: Login and signup flows were deferred to focus on core AI logic.
- Load/Performance Testing: Load handling hasn't been verified for high-concurrency scenarios.
- Complex URL Scraping: Advanced support for JS-rendered job portals was simplified.
- Broader Training Data: The MLP was trained largely on synthetic data — real data from O*NET and Kaggle was used only to seed Gemini CLI, which generated the synthetic training set. Training on more (and more real) data was deferred.
● What would you build next if you had more time?
- Executable Prep Workspace: An integrated environment to track and complete the generated plans.
- Gamified Preparation: Implementation of XP, badges, and skill levels.
- Robust JD Extraction: Better support for complex URLs (LinkedIn, Greenhouse, etc.).
- Professional PDF Output: High-quality exports for reports and roadmaps.
- Security & Quality Gates: Snyk and SonarQube scans for vulnerability detection and code-coverage/quality checks.

