Narender Rao Surabhi narendersurabhi

Narender Rao Surabhi

Applied AI Engineer | ML Platform | Agentic Workflows | LLM Pretraining, Fine-Tuning, RLHF | MLOps

I build production-grade ML and AI systems across workflow orchestration, RAG and agent runtimes, LLM pretraining and post-training, release governance, and Kubernetes-native platform operations.

My work centers on:

Agentic workflow platforms with typed execution contracts, tools, capabilities, memory, and triggers
LLM and RAG systems with evaluation, observability, and operational guardrails
LLM training and adaptation workflows including pretraining, supervised fine-tuning, and RLHF-style preference optimization
ML platform and release pipelines with measurable quality gates and controlled promotion
Local-first developer workflows that map cleanly to cloud and Kubernetes production patterns

All public repositories use educational, synthetic, or non-sensitive data only.

What I build

AI workflow platforms and orchestration runtimes
Tool-using agent and retrieval-augmented systems
LLM pretraining, fine-tuning, and RLHF-style alignment workflows
ML and LLM release governance pipelines
Observability-first platform services

Featured Projects

1) Agentic Workflow Studio

A full-stack platform for authoring and running AI-powered workflows through chat and a visual DAG editor, backed by typed execution contracts, reusable capabilities, memory, triggers, and Kubernetes-native orchestration.

Highlights

Chat, Compose, and Workflow Studio surfaces for conversational, goal-driven, and manually authored workflows
Typed planner/worker runtime with reusable capabilities, memory integration, control-flow nodes, retries, and dead-letter queue (DLQ) recovery
Kubernetes-ready scaling, artifact/document handling, and observability with Prometheus, Grafana, Loki, and Jaeger

Repo: https://github.com/narendersurabhi/agentic-workflow-studio

2) MLOps Release Gate Agent

A Kubernetes-first release governance project focused on controlled promotion, plan/execute workflows, policy gating, and observable ML operations.

Highlights

Release-gate workflow for evaluation, approval, and promotion
MLflow-backed artifacts and runtime decisioning
Prometheus and OpenTelemetry instrumentation with Jaeger tracing

Repo: https://github.com/narendersurabhi/mlops-release-gate-agent
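
To illustrate the release-gate idea, here is a minimal sketch of a threshold-based promotion check. It is not taken from the repository: the names `GateRule` and `evaluate_gate`, the metric names, and the thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class GateRule:
    """One measurable criterion a candidate model must satisfy (hypothetical)."""
    metric: str
    threshold: float
    higher_is_better: bool = True


def evaluate_gate(metrics: Dict[str, float], rules: List[GateRule]) -> Tuple[bool, List[str]]:
    """Return (approved, failures). A missing metric counts as a failure."""
    failures: List[str] = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is None:
            failures.append(f"{rule.metric}: missing")
            continue
        if rule.higher_is_better:
            passed = value >= rule.threshold
        else:
            passed = value <= rule.threshold
        if not passed:
            failures.append(f"{rule.metric}: {value} violates threshold {rule.threshold}")
    return (not failures, failures)


# Illustrative rules only; real gates would load these from policy config.
RULES = [
    GateRule("accuracy", 0.90),
    GateRule("p95_latency_ms", 200.0, higher_is_better=False),
]

if __name__ == "__main__":
    approved, failures = evaluate_gate({"accuracy": 0.93, "p95_latency_ms": 150.0}, RULES)
    print("promote" if approved else f"block: {failures}")
```

Returning the list of failures alongside the decision keeps the gate auditable: the same record that blocks a promotion also explains why.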

3) MCP Control Plane

A production-style MCP server with multiple transports, auth policy controls, and observability built in from the start.

Highlights

Stdio and HTTP MCP transport support
Scope-based bearer-token authorization model
OpenTelemetry and Prometheus instrumentation for runtime visibility

Repo: https://github.com/narendersurabhi/mcp-control-plane
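
As a rough illustration of scope-based bearer-token authorization, the sketch below checks an `Authorization` header against a table of tokens and their granted scopes. It is not the repository's implementation: the token values, scope names, and the `authorize` helper are hypothetical, and a real deployment would use signed or externally issued tokens rather than an in-memory table.

```python
import hmac
from typing import Dict, FrozenSet, Optional

# Hypothetical token-to-scope table, for illustration only.
TOKEN_SCOPES: Dict[str, FrozenSet[str]] = {
    "demo-read-token": frozenset({"tools:read"}),
    "demo-admin-token": frozenset({"tools:read", "tools:call"}),
}


def authorize(authorization_header: Optional[str], required_scope: str) -> bool:
    """Return True only if the header carries a known bearer token
    that has been granted the required scope."""
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return False
    for known_token, scopes in TOKEN_SCOPES.items():
        # Constant-time comparison avoids leaking token contents via timing.
        if hmac.compare_digest(known_token.encode(), token.encode()):
            return required_scope in scopes
    return False


if __name__ == "__main__":
    print(authorize("Bearer demo-read-token", "tools:read"))
    print(authorize("Bearer demo-read-token", "tools:call"))
```

Checking a single required scope per operation keeps the policy easy to reason about: each tool or endpoint declares the scope it needs, and the token either has it or does not.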

4) LLM Customization Ops

An end-to-end repository for LLM customization workflows spanning fine-tuning, RLHF-style preference optimization, evaluation, and serving.

Highlights

LoRA and QLoRA supervised fine-tuning flows
DPO-style preference optimization and evaluation gates
Production-facing API and observability patterns for model serving

Repo: https://github.com/narendersurabhi/llm-customization-ops
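
For readers unfamiliar with DPO-style preference optimization, the sketch below computes the standard DPO loss for a single preference pair in plain Python. It is a worked illustration of the published objective, not code from the repository. The function name `dpo_loss`, the example log-probabilities, and the default `beta` are assumptions. A training implementation would compute the same quantity over batches of tensors.

```python
import math


def dpo_loss(
    policy_logp_chosen: float,
    policy_logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,
) -> float:
    """DPO loss for one (chosen, rejected) pair.

    loss = -log(sigmoid(beta * margin)), where
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    and every input is a sequence log-probability.
    """
    margin = (policy_logp_chosen - ref_logp_chosen) - (
        policy_logp_rejected - ref_logp_rejected
    )
    x = beta * margin
    # -log(sigmoid(x)) equals softplus(-x); this form is numerically stable.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    # Policy identical to the reference: zero margin, loss is log(2).
    print(dpo_loss(-1.5, -1.5, -1.5, -1.5))
    # Policy favors the chosen response more than the reference does: lower loss.
    print(dpo_loss(-1.0, -2.0, -1.5, -1.5))
```

The loss falls as the policy raises the chosen response's log-probability relative to the rejected one, measured against the frozen reference model. `beta` controls how strongly that margin is rewarded.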

5) ML Platform Release Gates

A reference implementation for model governance and promotion across training, evaluation, registry flow, and serving.

Highlights

Train, evaluate, promote, and serve workflow
Artifact- and model-registry-oriented release path
FastAPI serving with Prometheus and Grafana-compatible metrics

Repo: https://github.com/narendersurabhi/ml-platform-release-gates

Selected Strengths

Contracts-first design for predictable component boundaries
LLM lifecycle coverage from training and post-training through evaluation and serving
Evidence-based release decisions using measurable criteria
Operability as a hard requirement, not a follow-on task
Reproducible local and CI workflows with deterministic test paths

Core Stack

Platform and API: Python, FastAPI, Docker, Kubernetes, Helm, MLflow
Observability: OpenTelemetry, Prometheus, Grafana, Jaeger, Loki
LLM systems: pretraining concepts, supervised fine-tuning, RLHF and preference optimization, RAG, tool-calling agents, workflow orchestration, evaluation pipelines
ML and data: PyTorch, XGBoost, PySpark, classical ML pipelines

Open To

Applied AI Engineer
AI Platform Engineer
ML Platform Engineer

Connect

LinkedIn: https://www.linkedin.com/in/narendersurabhi
GitHub: https://github.com/narendersurabhi
Location: Okemos, MI
