ML/NLP/LLM Engineer with expertise in AI Systems Architecture, Machine Learning, and Deep Learning. Specialized in building scalable AI systems, developing classical ML/DL models, implementing traditional NLP solutions, integrating large language models into production environments, and managing the full development lifecycle, from architecture to deployment.
Core competency lies in combining modern approaches (LLMs, multi-agent systems, RAG) with proven classical ML and DL methodologies to ensure system stability, predictability, and high performance.
- Development of RAG and GraphRAG systems
- Model fine-tuning (LoRA, QLoRA, PEFT) for domain-specific applications
- Inference optimization (vLLM, TensorRT, llama.cpp, Ollama)
- Advanced prompt engineering (Zero-shot, Few-shot, CoT, ReAct, Planning)
- Multi-agent system architecture (LangGraph, AutoGen, LangChain, planning agents)
- Agent integration with APIs and external services
- Dynamic tool selection systems
- Regression models (Linear, Ridge, Lasso) and classification algorithms (Logistic Regression, SVM, Decision Trees, Random Forest)
- Ensemble methods (Gradient Boosting, XGBoost, LightGBM, CatBoost)
- Clustering techniques (K-Means, DBSCAN, Hierarchical Clustering)
- Feature engineering, hyperparameter tuning, model validation
- Neural network development and training with PyTorch (MLP, CNN, RNN, LSTM, GRU)
- Transfer learning and fine-tuning of pre-trained models (ResNet, EfficientNet, BERT)
- Architecture optimization, regularization, scheduler implementation
- Large-scale dataset handling and GPU-accelerated training
- Text preprocessing: tokenization, stemming, lemmatization, stop-word removal
- Text vectorization (Bag-of-Words, TF-IDF, Word2Vec, FastText, GloVe)
- Text classification, sentiment analysis, topic modeling (LDA)
- Chatbot and dialogue system development using traditional NLP methods
- Integration of NLTK, spaCy, gensim into ML projects
- REST API development with FastAPI
- Data storage and caching with PostgreSQL and Redis
- API optimization for high-load environments
- Containerization (Docker, Docker Compose)
- CI/CD pipelines (GitHub Actions, GitLab CI)
- Model monitoring, logging, and management (MLflow, LangSmith)
- Implementation and optimization of vector search (ChromaDB, Pinecone, Weaviate, FAISS)
- Hybrid search system development
- Implemented Enterprise RAG system with corporate process integration and hybrid search support
- Developed multi-agent platform using LangGraph for educational process automation
- Built GraphRAG Knowledge System utilizing Neo4j and LLM for semantic search
- Developed and deployed classical ML models for price prediction, data classification, and risk assessment
- Trained and optimized CNN and LSTM architectures for image analysis and sequence processing tasks
- Mentored junior engineers, established development standards, conducted code reviews
- Successfully transitioned multiple AI products from prototype to stable production deployment
Tanym (Astana) | NLP/LLM Engineer
December 2024 — Present
- Lead developer of NLP/LLM modules in AI assistant platform
- Multi-agent system development and LLM integration into educational workflows
- RAG pipeline implementation, API development, and service containerization
- Inference optimization and generation quality enhancement
Programming Languages: Python (async-first, typing, pydantic v2, dependency injection, clean architecture), SQL (query optimization, indexes, transactions), C++ (performance-critical inference, bindings)
ML / DL Frameworks: PyTorch (production training & fine-tuning), PyTorch Lightning, Hugging Face Transformers, Accelerate, PEFT (LoRA / QLoRA), scikit-learn (baselines & evaluation), XGBoost, LightGBM, CatBoost, NumPy, pandas / Polars
LLM Frameworks & Orchestration: LangChain (production pipelines, integrations), LangGraph (stateful agents), AutoGen (multi-agent research & prototyping), OpenAI API, Hugging Face Inference, vLLM (serving integration)
NLP / Text Processing: spaCy (production NLP), NLTK (legacy & preprocessing), Embeddings (dense & domain-specific), Tokenization & chunking strategies, TF-IDF (baselines), Word2Vec, FastText, Text normalization & deduplication
Vector Search & Retrieval: ChromaDB (local & prototyping), FAISS (low-level vector search), Pinecone (managed vector DB), Weaviate (schema-aware vector search), pgvector, Hybrid search (BM25 + dense), Cross-encoder reranking
Databases & Caching: PostgreSQL (primary OLTP store), Redis (cache, rate limits, session memory)
MLOps / AI Ops: Docker (multi-stage builds), Docker Compose, GitHub Actions (CI/CD), MLflow (experiments & model registry), ClearML (pipeline orchestration), LangSmith (LLM tracing & eval), Environment-based config, Rollback-ready deployments
Inference & Performance Optimization: vLLM (high-throughput LLM serving), TensorRT (GPU optimization), Quantization (AWQ / GPTQ), Dynamic batching, KV-cache reuse, Streaming inference, llama.cpp, Ollama (local & edge inference)