    Repositories list

    • axs

      Public
      KRAI X workflow automation system
      Python
      MIT License
      2541 Updated Mar 9, 2026
    • inference

      Public
      Reference implementations of inference benchmarks
      Python
      Apache License 2.0
      612000 Updated Mar 6, 2026
    • Automated KRAI X workflows for reproducing MLPerf Inference submissions
      Python
      MIT License
      21120 Updated Mar 5, 2026
    • axs2stg

      Public
      Python
      MIT License
      0001 Updated Feb 27, 2026
    • axs2kiss

      Public
      Automated KRAI-X workflows for inference engines on selected backends: vLLM and SGLang on CUDA and ROCm, NIM/TensorRT-LLM on CUDA, using an OpenAI API compatibl…
      MIT License
      0000 Updated Feb 20, 2026
    • NeMo

      Public
      A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognitio…
      Python
      Apache License 2.0
      3.4k000 Updated Sep 24, 2025
    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      14k000 Updated Aug 26, 2025
    • kilt4qaic

      Public
      MIT License
      0000 Updated Jul 8, 2025
    • C++
      BSD 3-Clause "New" or "Revised" License
      41000 Updated May 14, 2025
    • axs2gcp

      Public
      Automated KRAI X workflows for Google Cloud Platform
      Python
      MIT License
      0400 Updated Apr 2, 2025
    • axs2qaic

      Public
      Automated KRAI X workflows for reproducing MLPerf Inference submissions on systems equipped with Qualcomm Cloud AI 100 accelerators
      Python
      MIT License
      0000 Updated Dec 12, 2024
    • KILT (KRAI Inference Library Technology) - proudly powering some of the fastest and most energy efficient submissions in the history of MLPerf Inference
      C++
      MIT License
      1100 Updated Dec 11, 2024
    • Automated KRAI X workflows for reproducing MLPerf Inference submissions
      Python
      MIT License
      0000 Updated Dec 11, 2024
    • Building Docker images for reproducing MLPerf Inference submissions with Qualcomm Cloud AI 100 accelerators
      Shell
      Other
      0000 Updated Dec 11, 2024
    • axs2kilt

      Public
      Automated KRAI X workflows for reproducing MLPerf Inference submissions powered by KRAI Inference Library Technology (KILT)
      Python
      MIT License
      1100 Updated Dec 11, 2024
    • kilt4uai

      Public
      A plugin for KILT (KRAI Inference Library Technology) for integration with Untether AI's imAIgine SDK
      C++
      MIT License
      0000 Updated Aug 6, 2024
    • axs2uai

      Public
      Automated KRAI X workflows for reproducing MLPerf Inference submissions on systems with Untether AI's speedAI at-memory compute inference accelerators
      Python
      MIT License
      0000 Updated Aug 6, 2024
    • Automated KRAI X workflows for reproducing MLPerf Inference submissions on systems with…
      MIT License
      0000 Updated Jul 26, 2024
    • policies

      Public
      General policies for MLPerf™ including submission rules, coding standards, etc.
      Python
      Apache License 2.0
      61000 Updated Jun 11, 2024
    • Issues related to MLPerf™ Inference policies, including rules and suggested changes
      Apache License 2.0
      56000 Updated Apr 2, 2024
    • Qualcomm Cloud AI SDK (Platform and Apps) enable high performance deep learning inference on Qualcomm Cloud AI platforms delivering high throughput and low late…
      Jupyter Notebook
      Other
      17000 Updated Mar 12, 2024
    • MIT License
      0000 Updated Jan 29, 2024
    • power-dev

      Public
      Dev repo for power measurement for the MLPerf benchmarks
      Python
      Apache License 2.0
      27000 Updated Jan 26, 2024
    • LLM_Wiki

      Public
      This is just a place for us to put whatever we’ve learnt about LLMs, be it papers, blog posts or our own experiences.
      0100 Updated Nov 29, 2023
    • TextToJSONConverter
      HTML
      0000 Updated Oct 25, 2023
    • TextLineConverter for NLP
      HTML
      0000 Updated Oct 25, 2023
    • This repository contains the results and code for the MLPerf™ Inference v3.1 benchmark.
      Apache License 2.0
      13000 Updated Oct 10, 2023
    • Prune a model while finetuning or training.
      Jupyter Notebook
      Apache License 2.0
      62000 Updated Sep 26, 2023
    • axs2snpe

      Public
      MIT License
      0000 Updated Sep 5, 2023
    • Python
      MIT License
      0000 Updated Aug 18, 2023