This project is no longer maintained. Archived for reference.
A test tool for benchmarking local LLM performance on a Mac Studio.
---
Install Dependencies: Ensure Python is installed, then install the required packages with pip:
pip install -r requirements.txt
(Note: This project may include some Node.js-related files, but no Node.js-specific setup is required for the core performance testing.)
---
Run Performance Test: Execute the main test script:
python test_performance.py
Performance test results are logged to performance_test.log. This file records the details of each test run, including response times and memory usage.
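The exact layout of performance_test.log depends on how test_performance.py writes its entries; as a sketch, assuming each entry records a figure like `inference time 10.21s`, the log can be summarized with a few lines of Python:

```python
import re

# Hypothetical log excerpt -- the real format of performance_test.log may differ.
sample_log = """\
2024-01-01 12:00:00 INFO Single-turn test: inference time 63.22s
2024-01-01 12:05:00 INFO Multi-turn test, turn 1: inference time 10.21s
"""

# Collect every "<number>s" inference-time figure from the log text.
times = [float(m) for m in re.findall(r"inference time ([\d.]+)s", sample_log)]
print(times)                                   # [63.22, 10.21]
print(f"average: {sum(times) / len(times):.2f}s")
```

Adjust the regular expression to match whatever line format the script actually emits.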
- .gitignore: Specifies intentionally untracked files that Git should ignore (e.g., log files, generated CSVs, node_modules).
- requirements.txt: Lists the Python dependencies required for the project.
- test_performance.py: The main script for running performance tests.
Performance test results for local LLM configurations using mlx-omni-server and Ollama.
Configuration 1:
- Single-turn Test:
  - Total Inference Time: 63.22s
- Multi-turn Test:
  - Turn 1 Inference Time: 10.21s
  - Turn 2 Inference Time: 9.01s
- Overall Averages (Successful Tests):
  - Average Inference Time: 27.48s
  - Average Token Generation Speed: 43.78 tokens/s
Configuration 2:
- Single-turn Test:
  - Total Inference Time: 120.67s
- Multi-turn Test:
  - Turn 1 Inference Time: 7.65s
  - Turn 2 Inference Time: 8.43s
- Overall Averages (Successful Tests):
  - Average Inference Time: 45.58s
  - Average Token Generation Speed: 37.57 tokens/s
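The overall averages above are consistent with the per-test figures: for each configuration, the reported average inference time is the mean of the single-turn total and the two multi-turn times. A quick arithmetic check:

```python
def mean(xs):
    return sum(xs) / len(xs)

# First configuration: single-turn 63.22s, multi-turn 10.21s and 9.01s.
print(round(mean([63.22, 10.21, 9.01]), 2))    # 27.48

# Second configuration: single-turn 120.67s, multi-turn 7.65s and 8.43s.
print(round(mean([120.67, 7.65, 8.43]), 2))    # 45.58
```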