  • University of California, Berkeley
  • Seattle, WA
  • LinkedIn in/rishabhsinha17

rishabhsinha17/README.md

💫 About Me:

🔭 I’m currently working on
Low-latency LLM infrastructure at Amazon Web Services, building systems with SigV4 + LDAP authentication for production clients. I also optimize distributed observability pipelines across 150+ AWS services using OAuth-backed sync platforms, ECS Fargate, and Docker.

👯 I’m looking to collaborate on
Projects involving HPC, scalable machine learning systems, or efficient inference backends—especially ones pushing token throughput, optimizing distributed performance, or innovating in edge deployment.

🤝 I’m looking for help with
Advanced GPU kernel-level tuning and low-latency optimization techniques for multimodal LLMs. I'm also interested in learning more about quant trading infrastructure or novel compression algorithms for AI/ML inference.

🌱 I’m currently learning
Distributed training frameworks in depth, especially gradient checkpointing and tensor parallelism for multimodal models. I'm also brushing up on real-time streaming data systems and reinforcement learning for ops optimization.

💬 Ask me about

Achieving 120 tokens/sec inference on Llama-3 using vLLM
Cutting runtime by 45 seconds at NIST with multiprocessing and Numba
Engineering sub-5 ms auth for AWS clients with Coral + Guice
Developing GraphQL backends and Kafka pipelines at Clinia

⚡ Fun fact
I once boosted image processing throughput by 720% using a 72-core EC2 HPC setup—on my own—and I still have the benchmark logs to prove it.

🌐 Socials:

LinkedIn email

💻 Tech Stack:

C, C#, C++, CSS3, Go, GraphQL, HTML5, Java, JavaScript, OCaml, PHP, Rust, Ruby, R, Python, Scala, Bash Script, TypeScript, Firebase, Datadog, Vercel, AWS, .NET, Apache Hadoop, Apache Kafka, Bootstrap, Elasticsearch, NVIDIA, Express.js, FastAPI, Flask, jQuery, OpenCV, Node.js, Next.js, Redux, React Hook Form, React Router, React Query, React Native, React, Spring, Vue.js, TailwindCSS, Apache Maven, Apache Airflow, Jenkins, Nginx, MySQL, MongoDB, Redis, Amazon DynamoDB, Keras, Matplotlib, MLflow, NumPy, Pandas, Plotly, PyTorch, scikit-learn, SciPy, TensorFlow, GitHub, Git, Gradle, Kubernetes, Jira, Postman, Terraform, Docker

📊 GitHub Stats:



🏆 GitHub Trophies

🔝 Top Contributed Repo


Pinned

  1. low-latency-llm-inference-server (Public)

    Production-grade stack delivering 120 tokens/s from Llama-3-8B with 40% lower p99 latency under 32-request concurrency.

    C#

  2. rl-hyperparam-tuner (Public)

    End‑to‑end prototype that trains a ResNet‑18 on CIFAR‑10 while a PPO agent dynamically adjusts learning rate. Metrics are logged to PostgreSQL via PySpark and visualized with a Grafana dashboard. A…

    Python 1

  3. slurm-vision-rag-platform (Public)

    End‑to‑end reference implementation for a vision RAG pipeline fine‑tuned on LLaVA‑1.5‑7B and served via FastAPI.

    Python

  4. Real-Time-Twitter-Stock-Sentiment-Transformer-Model (Public)

    The Real-Time Twitter Stock Sentiment Analysis used Python, Transformers, and Twitter API to analyze stock sentiments from real-time tweets. It involved data acquisition, preprocessing, Transformer…

    Python 1

  5. Real-time-Sign-Language-Recognition-Using-OpenCV-and-Deep-Learning (Public)

    Employed OpenCV for video processing and hand-detection in real-time. Utilized Keras with TensorFlow backend to train a deep learning model for sign language classification on a dataset of 2900 300…

    Python 1

  6. Lorenz-System-Attractor-Singular-Value-Decomposition-Complex-Systems-Research-Using-Python (Public)

    This Python project computes the singular value decomposition of the trajectory matrix of a Lorenz system attractor. The Python program creates a three-dimensional plot of the trajectory matrix of …

    Python
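
The idea behind the last pinned project, taking the SVD of a Lorenz trajectory matrix, can be illustrated with a minimal sketch. This is not the repository's code. It assumes the classic parameters (sigma = 10, rho = 28, beta = 8/3), a fixed-step RK4 integrator, and a trajectory matrix whose rows are sampled (x, y, z) states, mean-centered before decomposition. All function names, the step size, and the initial state are hypothetical choices made for illustration.

```python
import numpy as np


def lorenz_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations (classic parameters assumed)."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])


def integrate_lorenz(initial_state, dt=0.01, n_steps=5000):
    """Integrate with fixed-step RK4; returns an (n_steps + 1, 3) trajectory matrix."""
    trajectory = np.empty((n_steps + 1, 3))
    trajectory[0] = initial_state
    for i in range(n_steps):
        s = trajectory[i]
        k1 = lorenz_rhs(s)
        k2 = lorenz_rhs(s + 0.5 * dt * k1)
        k3 = lorenz_rhs(s + 0.5 * dt * k2)
        k4 = lorenz_rhs(s + dt * k3)
        trajectory[i + 1] = s + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return trajectory


def trajectory_svd(trajectory):
    """Thin SVD of the mean-centered trajectory matrix (centering is an assumption)."""
    centered = trajectory - trajectory.mean(axis=0)
    return np.linalg.svd(centered, full_matrices=False)


# Hypothetical initial condition; any point off the unstable equilibria works.
trajectory = integrate_lorenz(np.array([1.0, 1.0, 1.0]))
U, S, Vt = trajectory_svd(trajectory)
print("singular values:", S)
print("principal directions (rows):")
print(Vt)
```

The three singular values show how the attractor's variance is distributed along the orthogonal directions in `Vt`. A three-dimensional plot of `trajectory`, as the project description mentions, could be added with Matplotlib.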