forked from mnaumovfb/inference
@dkorchevgithub adding lines suggested by @EtoDemerzel0427 to fix "g… #13
Open: dkorchevgithub wants to merge 642 commits into dkorchevgithub:master from mlcommons:master
* Update GNN reference implementation: add DGL backend * [Automated Commit] Format Codebase * Update README.md
… error in DLRMv2 implementation #1909, fixes preprocess_submission (#1910) * Update generate_final_report.py * Fix sdxl (#1911) * Fix typo in fid_score.py, fail_safe for SDXL short runs * [Automated Commit] Format Codebase * Fix typo in fid_score.py, fail_safe for SDXL short runs * Fix dlrmv2 reference implementation | Update run_local.sh * Fixes for filtering invalid results * [Automated Commit] Format Codebase * Update preprocess_submission.py * Added an option to pass in sample_ids.txt for SDXL accuracy check * [Automated Commit] Format Codebase * Update accuracy_coco.py * [Automated Commit] Format Codebase * Fix typo * Not use default for sample_ids.txt --------- Co-authored-by: arjunsuresh <arjunsuresh@users.noreply.github.com>
Updating the pip packages
* Update generate_final_report.py * Fix sdxl (#1911) * Fix typo in fid_score.py, fail_safe for SDXL short runs * [Automated Commit] Format Codebase * Fix typo in fid_score.py, fail_safe for SDXL short runs * Fix dlrmv2 reference implementation | Update run_local.sh * Fixes for filtering invalid results * [Automated Commit] Format Codebase * Update preprocess_submission.py * Added an option to pass in sample_ids.txt for SDXL accuracy check * [Automated Commit] Format Codebase * Update accuracy_coco.py * [Automated Commit] Format Codebase * Fix typo * Not use default for sample_ids.txt * Update requirements.txt (#1907) Updating the pip packages * Fix a bug in preprocess_submission * Update submission_checker.py | Removed TEST05 --------- Co-authored-by: arjunsuresh <arjunsuresh@users.noreply.github.com>
Signed-off-by: Joyce Brum <joycebrum@google.com> Co-authored-by: Mitchelle Rasquinha <80070689+mrasquinha-g@users.noreply.github.com>
Co-authored-by: Mitchelle Rasquinha <80070689+mrasquinha-g@users.noreply.github.com>
* Fixed syntax error in list .insert method * Update fid_score.py --------- Co-authored-by: Arjun Suresh <arjun@gateoverflow.com> Co-authored-by: Arjun Suresh <arjunsuresh1987@gmail.com>
Docs update
* 🔄 synced local 'tools/submission/power/power_checker.py' with remote 'compliance/check.py' * [Automated Commit] Format Codebase --------- Co-authored-by: mlcommons-bot <null> Co-authored-by: mlcommons-bot <mlcommons-bot@users.noreply.github.com> Co-authored-by: Mitchelle Rasquinha <80070689+mrasquinha-g@users.noreply.github.com> Co-authored-by: Arjun Suresh <arjunsuresh1987@gmail.com> Co-authored-by: Arjun Suresh <arjun@gateoverflow.com>
https://huggingface.co/docs/huggingface_hub/main/en/guides/cli Co-authored-by: Arjun Suresh <arjunsuresh1987@gmail.com> Co-authored-by: Miro <mirhodak@amd.com>
* Initial codebase llama3-405b reference implementation * Add VLLM backend * Prune llama2 files & Update commands * Update evaluate accuracy script * Fix minor issues * Add Llama3 configuration * [Automated Commit] Format Codebase * Set tensor_parallel_size to 8 * Add requirements.txt * Remove consolidate_results.py * Host dataset + add instructions to download it * [Automated Commit] Format Codebase * Fix dockerfile + update README --------- Co-authored-by: pgmpablo157321 <pgmpablo157321@users.noreply.github.com>
…1948) * Update generate_final_report.py * Fix sdxl (#1911) * Fix typo in fid_score.py, fail_safe for SDXL short runs * [Automated Commit] Format Codebase * Fix typo in fid_score.py, fail_safe for SDXL short runs * Fix dlrmv2 reference implementation | Update run_local.sh * Fixes for filtering invalid results * [Automated Commit] Format Codebase * Update preprocess_submission.py * Added an option to pass in sample_ids.txt for SDXL accuracy check * [Automated Commit] Format Codebase * Update accuracy_coco.py * [Automated Commit] Format Codebase * Fix typo * Not use default for sample_ids.txt * Update requirements.txt (#1907) Updating the pip packages * Fix a bug in preprocess_submission * Update submission_checker.py | Removed TEST05 * Fix to SDXL accuracy output * Added exists checks for rmtree in preprocess_submission script * [Automated Commit] Format Codebase * Delete .github/workflows/format.yml * Delete .github/scripts directory * Update build_wheels.yml | Added src distribution * Update VERSION.txt * Update build_wheels.yml * Update VERSION.txt * Update pyproject.toml * Increment version to 4.1.26 * Update MANIFEST.in * Increment version to 4.1.27 * Update pyproject.toml * Increment version to 4.1.28 * Update build_wheels.yml * Update VERSION.txt * Update accuracy_coco.py * Making sdxl run thread safe * Create format.yml | Run format on push instead of PR --------- Co-authored-by: arjunsuresh <arjunsuresh@users.noreply.github.com>
* Remove GLT implementation for GNN * Update Readme: remove GLT and add checkpoint * Update mlperf.conf file * Dockerize rgat benchmark * Update README * Add gnn accuracy script * Remove mlperf.conf from GNN reference implementation * Add minimum system requirements to README * Fix paths in accuracy script * Fix typing in accuracy script
* Fix llama3 multi-gpu issue * Fix indentation * Remove llama2 outdated references * Fix run_evaluation argument
* Submission checker v5.0 * [Automated Commit] Format Codebase --------- Co-authored-by: pgmpablo157321 <pgmpablo157321@users.noreply.github.com> Co-authored-by: Arjun Suresh <arjun@gateoverflow.com>
* Fix loadgen build for version numbers having "0" * Update test-resnet50.yml * Update test-retinanet.yml * Update test-bert.yml
Co-authored-by: Miro <mirhodak@amd.com>
* Fix submission checker for v5.0 rgat * Update submission_checker.py | Updates for v5.0 * [Automated Commit] Format Codebase * Update submission_checker.py | Fixes latency_constraints for v5.0 * [Automated Commit] Format Codebase --------- Co-authored-by: mlcommons-bot <mlcommons-bot@users.noreply.github.com>
* add numbers into labels add * remove euro, cent...etc --------- Co-authored-by: hanyunfan <frank.han@dell.com>
* Update documentation page: migration to R2 * fix for llama2 assets download instructions
…patibility (#2344) * [Automated Commit] Format Codebase * Updated tags for submission checker command in docs * Update mobilenets docs * Update main.py * Update main.py * update dataset download commands - waymo calib (#2130) * Merge from Master (#2155) * Update submission_checker.py | Fix open model unit in Results (#2144) * Add Llama 3.1 to special unit dict (#2150) --------- Co-authored-by: Pablo Gonzalez <pablo.gonzalez@factored.ai> * [Automated Commit] Format Codebase * Inference docs - Update model and dataset download commands (#2153) * Update llama2 70b model download docs * changes in model and dataset download commands * add powershell command to get result folder structure (#2156) * [Automated Commit] Format Codebase * [Automated Commit] Format Codebase * [Automated Commit] Format Codebase * [Automated Commit] Format Codebase * [Automated Commit] Format Codebase * Fix Typo in Interactive Latencies (#2147) (#2225) * Fix Typo in Interactive Latencies * Update submission_checker.py * Fix Typo in Interactive Latencies (#2147) (#2226) * Fix Typo in Interactive Latencies * Update submission_checker.py --------- Co-authored-by: Miro <mirhodak@amd.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * Update MLCFlow commands for v5.1 (#2237) * [Automated Commit] Format Codebase * [Automated Commit] Format Codebase * Update main.py * [Automated Commit] Format Codebase * updating for 5.1-dev (inference doc) * [Automated Commit] Format Codebase * fix typo * [Automated Commit] Format Codebase * Update main.py * [Automated Commit] Format Codebase * [Automated Commit] Format Codebase * Doc updates (#2292) * improve submission doc * Update index.md * Fix for model and dataset download commands * update submission doc * [Automated Commit] Format Codebase * Update index.md * r2_downloader -> r2-downloader * Update multithreading information about SDXL * [Automated Commit] Format Codebase * .lower() for consistency * [Automated Commit] 
Format Codebase * update for llama3_1-8b edge * [Automated Commit] Format Codebase --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Arjun Suresh <arjun@gateoverflow.com> * Add quiet flags to MLC commands (#2309) * Improve docs - submission generation (#2311) * [Automated Commit] Format Codebase * Refactor run_verification scripts to improve OS compatibility * [Automated Commit] Format Codebase * Refactor accuracy verification to use file reading instead of shell commands for compatibility * [Automated Commit] Format Codebase * Update folder structure by removing Nvidia folder --------- Co-authored-by: github-actions <github-actions@github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Arjun Suresh <arjun@gateoverflow.com> Co-authored-by: ANANDHU S <71482562+anandhu-eng@users.noreply.github.com> Co-authored-by: Nathan Wasson <nathanw@mlcommons.org> Co-authored-by: Pablo Gonzalez <pablo.gonzalez@factored.ai> Co-authored-by: Miro <mirhodak@amd.com>
…entation (#2381) * Initial commit. * WIP * [Automated Commit] Format Codebase * misc * adding pydantic_typer * offline WIP * [Automated Commit] Format Codebase * [Automated Commit] Format Codebase * rename the notebook * clean-up * [Automated Commit] Format Codebase * Downgrade from 3.13 to 3.12 * [Automated Commit] Format Codebase * send the response back to LoadGen one at a time * Move the ownership of the AsyncOpenAI client into Task, and clean up the client, event loop and the event loop thread * [Automated Commit] Format Codebase * fixing typo * [Automated Commit] Format Codebase * allowing --settings.min_duration to take in float or int as seconds * fix lint * [Automated Commit] Format Codebase * Parametrize use_token_latencies * [Automated Commit] Format Codebase * fix typos * [Automated Commit] Format Codebase * update README --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* initial server version * change description for parameter use_token_latencies * [Automated Commit] Format Codebase * changes based on PR comments
* Update submission checker * Add submission structure description in markdown file * Add description of placeholders in submission structure * Update preprocess submission * Add directory structure samples * Structure documentation fixes * Add power to submission structure documentation * Fixes for preprocessing submission * To be reverted: point to custom inference branch * To be reverted: use custom mlc automation branch * To be reverted: use custom branch for automation and inference repo * To be reverted: use custom branch for mlcflow and inference repo * Turn off the action * To be reverted: custom branches for mlcflow and inference repo --------- Co-authored-by: ANANDHU S <71482562+anandhu-eng@users.noreply.github.com>
* Initial proposal for VLM - evaluation * address review comments and test hiclass implementation * [Automated Commit] Format Codebase * additional fixes to reviews * [Automated Commit] Format Codebase * address PR comments * [Automated Commit] Format Codebase * add a more detailed description of the field dataset.split * Enable exception logging in _query_endpoint_async * [Automated Commit] Format Codebase * [Automated Commit] Format Codebase * Trigger CI/CD pipeline * Add performance_sample_count_override as a CLI flag. * [Automated Commit] Format Codebase * add json format to queries * [Automated Commit] Format Codebase * added schema file and made necessary changes * [Automated Commit] Format Codebase * refactoring and linting * [Automated Commit] Format Codebase * Add Dockerfile * Add use_guided_decoding to let user choose to use guided_decoding or not. * [Automated Commit] Format Codebase * add f1 scores of uniform random selection * [Automated Commit] Format Codebase * Enabling mlperf-inf-mm-vl2l benchmark vllm. * [Automated Commit] Format Codebase * Commit to trigger the GitHub Actions in inference PR * empty commit --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Shang Wang <samshang.wang@mail.utoronto.ca> Co-authored-by: ANANDHU S <71482562+anandhu-eng@users.noreply.github.com> Co-authored-by: Shang Wang <shangw@nvidia.com>
* add changes to brand field * fix format * [Automated Commit] Format Codebase * delete extra check in calculate_brand_f1_score * reflect the latest version of the dataset * [Automated Commit] Format Codebase * change default server_expected_qps to 10 * fix errors in task.py * add format fixes * add request timeout * [Automated Commit] Format Codebase * fix hanging * update readme * small fix * [Automated Commit] Format Codebase * empty commit --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Shang Wang <samshang.wang@mail.utoronto.ca> Co-authored-by: Shang Wang <shangw@nvidia.com>
…t (that we are going to freeze) (#2406) * notebook WIP * Update the notebook to reflect the latest version of the dataset * update * update * [Automated Commit] Format Codebase * empty commit --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Update loadgen to 6.0; Bulk update 6.0 checker bits * Fix qwen3vl
* Add tests to detect failures in igbh dataset download * Update Python version in CI workflow
* text to video ref implementation * update dataset and readme * move files to wan folder
…ast if the underlying `vllm serve` process already failed (#2409) * enable VllmDeployer to fail fast if the underlying vllm process failed. * example slurm script for submitting jobs * fix slurm scripts * small fix * [Automated Commit] Format Codebase * Update the readme about the example slurm scripts. * Change the default endpoint startup timeout to 1 hour in case someone needs to download the model for the first time. * change server expected qps and target latency * Change the default dataset repo_id to the new name of the public dataset * [Automated Commit] Format Codebase * evaluate the json file with multiprocess * [Automated Commit] Format Codebase * change default server_target_latency to 12 * revert evaluation changes * [Automated Commit] Format Codebase * update slurm script * update slurm script * revert evaluation.py changes after analysing the discrepancy in is_secondhand f1 score * [Automated Commit] Format Codebase * linting * [Automated Commit] Format Codebase * lock in model and dataset SHA * [Automated Commit] Format Codebase * Specify model quality target and server target latency in the README * Update loadgen/mlperf.conf * aligning TestSettings' C++ code with its python binding * [Automated Commit] Format Codebase * remove ttft and tpot from mlperf.conf * Enable CLI to take in user.conf * [Automated Commit] Format Codebase * readme * readme * rename vl2l -> q3vl * [Automated Commit] Format Codebase * empty * rerun ci * rerun ci * Introduce sampling parameters * [Automated Commit] Format Codebase * [Automated Commit] Format Codebase * empty * move CFLAGS="-std=c++14 -O3" into extra_compile_args of Pybind11Extension * [Automated Commit] Format Codebase * enable specifying loadgen source in the Dockerfile * update slurm scripts * Maintain None as the default value for the sampling params * [Automated Commit] Format Codebase * update readme * [Automated Commit] Format Codebase * empty --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: John Calderon <jcalderon@nvidia.com>
* Initial draft for Inference submission guide * Update index.md
…ood" count (mlcommons#965)
patch for the latest dlrm
updated Docker CPU as suggested in issues mlcommons#917 and mlcommons#604
updated for issues "DLRM inference README out of date" (mlcommons/inference#917) and "DLRM: save downloaded and generated files outside of Docker containers" (mlcommons/inference#604)
Update README.md
updating readme
fixed typo noticed by @psyhtest
adding lines suggested by @EtoDemerzel0427 to fix "good" count
Co-authored-by: Anton Lokhmotov <psyhtest@users.noreply.github.com>
Co-authored-by: rameshchukka <rnaidu02@yahoo.com>