SageMaker Bencher offers a collection of scripts to automatically prepare datasets and to design, orchestrate, track, and analyze complex benchmarking experiments based on SageMaker Training Jobs. With SageMaker Bencher, you can specify both the dataset preparation rules and the benchmark execution procedure in a single YAML file, which can be shared with anyone who wants to replicate or customize the benchmarks themselves.
The goal of SageMaker Bencher is three-fold:
- Large-scale user-friendly experimentation -- with SageMaker Bencher you can quickly set up a custom benchmark consisting of hundreds of trials by providing a single declarative configuration file in a human-friendly format. This configuration file specifies all the settings required to run and track the experiment, including the dataset specification and the parallel execution strategy for large-scale orchestration of multiple concurrent trials. Each trial can be further set up to execute its own training script, run on a different instance type, use a different dataset, leverage a different data ingestion mode, or benchmark any other (hyper-)parameter that one would like to test.
- Convenient experiment tracking -- all trial components of an experiment (including input parameters, code versions, metrics, and output files) are automatically tracked in SageMaker Experiments for subsequent lineage tracking and result analysis.
- Easy sharing and one-click reproducibility -- one of the obstacles to reproducing published benchmark results is the need to first download and then prepare the required dataset in some specific way to ensure an "apples to apples" comparison. With SageMaker Bencher you can automatically download, create, and store datasets in S3 buckets according to various preparation rules before starting the benchmarks, thereby enabling others to easily replicate your benchmark results.
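To give a feel for this declarative, YAML-based approach, here is a minimal hypothetical sketch of such a configuration, written out from Python. All field names below are illustrative assumptions, not the actual schema -- consult the Experiment Configuration File Guide referenced further down for the real format:

```python
import pathlib

# Hypothetical, minimal experiment definition as a YAML string. Field names
# are illustrative assumptions -- see the Experiment Configuration File Guide
# for the actual schema used by SageMaker Bencher.
EXPERIMENT_YAML = """\
name: my-ingestion-benchmark      # experiment name used for tracking
role: SageMakerRoleBenchmark      # IAM role assumed by the training jobs
trials:
  - name: file-mode-trial
    instance_type: ml.p3.2xlarge
    input_mode: file              # data ingestion mode under test
  - name: fastfile-mode-trial
    instance_type: ml.p3.2xlarge
    input_mode: fastfile
"""

pathlib.Path("my-experiment.yml").write_text(EXPERIMENT_YAML)
```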
Before launching an experiment with SageMaker Bencher, there are a few things that we'd need to set up.
Make sure that you have all the required packages by running:

```
pip install -r ./requirements.txt
```
For convenience, we advise creating a new IAM role named `SageMakerRoleBenchmark` with these two policies attached:

- `AmazonS3FullAccess`
- `AmazonSageMakerFullAccess`
Of course, one can pick any name for the IAM role; just remember to adjust the experiment config file accordingly.
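If you prefer to script this step, a minimal sketch using boto3 (assuming your AWS credentials are allowed to manage IAM) could look as follows:

```python
import json
import boto3

iam = boto3.client("iam")

# Trust policy that lets SageMaker assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "sagemaker.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

iam.create_role(
    RoleName="SageMakerRoleBenchmark",
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)

# Attach the two managed policies listed above.
for policy_arn in (
    "arn:aws:iam::aws:policy/AmazonS3FullAccess",
    "arn:aws:iam::aws:policy/AmazonSageMakerFullAccess",
):
    iam.attach_role_policy(RoleName="SageMakerRoleBenchmark", PolicyArn=policy_arn)
```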
In case you intend to use FSx for Lustre as a data source in your benchmarks, you will need to do the following:
- First, create a VPC, subnets, security groups, and an S3 endpoint in the region where you plan to run the benchmarks:

  ```
  ./fsx_setup/stack-sm.sh ${REGION} ${S3_BUCKET_NAME}
  ```

  Example:

  ```
  ./fsx_setup/stack-sm.sh eu-central-1 sagemaker-benchmark-eu-central-1-<your-AWS-ID>
  ```

- Then, make sure to write down the output subnet and security group IDs.
- Create the FSx filesystem with:

  ```
  ./fsx_setup/stack-fsx.sh ${REGION} ${S3_BUCKET_NAME}/ ${FILESYSTEM_CAPACITY} ${SUBNET} ${SECURITY_GROUP}
  ```

  Example:

  ```
  ./fsx_setup/stack-fsx.sh eu-central-1 s3://sagemaker-benchmark-eu-central-1-<your-AWS-ID>/ 3600 subnet-XXXXXXXXXXXXXXXXX sg-XXXXXXXXXXXXXXXXX
  ```

- Observe the resulting `FSxFileSystemId` value. Then specify the following fields in the experiment config file: `subnets`, `security_group_ids`, and `fsx_file_system_id`, as sketched below.
- Now you can set `input_mode: fsx` in a trial definition to use the FSx filesystem provisioned above as the data ingestion mechanism.
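As a sketch of where these stack outputs plug in (only the field names `subnets`, `security_group_ids`, `fsx_file_system_id`, and `input_mode` come from this guide; the surrounding structure is an assumption):

```python
import yaml

# Illustrative config fragment; the surrounding structure is an assumption,
# only the field names come from the FSx setup steps above.
FSX_FRAGMENT = """\
subnets:
  - subnet-XXXXXXXXXXXXXXXXX              # from the stack-sm.sh output
security_group_ids:
  - sg-XXXXXXXXXXXXXXXXX                  # from the stack-sm.sh output
fsx_file_system_id: fs-XXXXXXXXXXXXXXXXX  # FSxFileSystemId from stack-fsx.sh
trials:
  - name: fsx-trial
    input_mode: fsx                       # ingest training data via FSx
"""

print(yaml.safe_load(FSX_FRAGMENT))
```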
Launching a benchmarking experiment can be done either programmatically (see the sketch after the options below) or by using one of the small launcher scripts:
- (Option A) run `bencher.py` and specify the experiment config file after the `-f`/`--file` flag. Example:

  ```
  python bencher.py -f path/to/experiment.yml
  ```

- (Option B) start benchmarking directly from the `bencher.ipynb` notebook (e.g. in SageMaker Studio).
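For the programmatic route, one option that assumes nothing about bencher's internal Python API is to shell out to the same CLI, for example:

```python
import subprocess

# Launch a sequence of experiments by invoking the CLI from Python,
# equivalent to running `python bencher.py -f <config>` by hand (Option A).
configs = [
    "experiments/test/test-caltech.yml",
    "experiments/test/test-synthetic.yml",
]

for config in configs:
    subprocess.run(["python", "bencher.py", "-f", config], check=True)
```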
Note: make sure that the default region of your AWS CLI is consistent with the region specified in the experiment configuration file.
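One quick way to check this from Python (boto3 resolves the default region from the same configuration chain as the AWS CLI):

```python
import boto3

# Prints the default region picked up from your AWS configuration; it should
# match the region in the experiment config file (e.g. "eu-central-1").
print(boto3.session.Session().region_name)
```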
As an end-to-end example, let us replicate the benchmarks from the AWS ML blog post "Choose the best data source for your Amazon SageMaker training job", in which the authors tested different native SageMaker data ingestion options, across various dataset compositions, for an image classification use case with a ResNet-50 model (see the blog post for details). Concretely, the authors conducted an experiment consisting of 12 trials (each trial repeated 3 times) to test the following combinations:
- Input modes – FSx for Lustre, File mode, FastFile mode
- Dataset size – Smaller dataset (1 GB), larger dataset (54 GB)
- Individual file sizes – Smaller files (JPGs, approximately 39 KB), larger files (TFRecord, approximately 110 MB)
For convenience, the full specification of all benchmarks performed in the blog post (3 input modes × 2 dataset sizes × 2 file sizes = 12 trials) is available as a SageMaker Bencher experiment configuration file in `experiments/blog-benchmarks-all.yml`, so the entire experiment can be replicated by running:
```
python bencher.py -f experiments/blog-benchmarks-all.yml
```
Note: see the Experiment Configuration File Guide for details on the structure of experiment configuration files.
Once started, the results of every conducted trial will be fully logged using SageMaker Experiments, as illustrated below.
When the benchmark is finished, the results can be conveniently pulled into, e.g., a Pandas DataFrame and analyzed with any tools of your choice. As an example, see the `result_analysis/analyze-and-plot-benchmarks.ipynb` notebook to replicate the benchmark analysis and plots discussed in the blog post above, or to produce your own additional plots:
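For instance, a minimal sketch using the SageMaker SDK's `ExperimentAnalytics` class (the experiment name below is a placeholder; use the one from your config file):

```python
from sagemaker.analytics import ExperimentAnalytics

# Pull all tracked trial components of an experiment into a Pandas DataFrame.
# "my-benchmark-experiment" is a placeholder for your experiment's name.
analytics = ExperimentAnalytics(experiment_name="my-benchmark-experiment")
df = analytics.dataframe()
print(df.head())
```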
To ensure that everything is set up correctly (Python packages, IAM roles, optional FSx volumes, etc.), you can quickly run a set of tests:
- `python bencher.py -f experiments/test/test-caltech.yml` - to test different SageMaker data ingestion mechanisms on the Caltech dataset
- `python bencher.py -f experiments/test/test-synthetic.yml` - to test different SageMaker data ingestion mechanisms on a synthetic dataset
- `python bencher.py -f experiments/test/test-s3prefix.yml` - to test different SageMaker data ingestion mechanisms on any generic dataset stored on S3 (the dataset has to be manually created and uploaded to the specified S3 location)

