deep-person-reid

This repo contains PyTorch implementations of deep person re-identification models.

Pretrained models are available.

We will actively maintain this repo to incorporate new models.

Install

cd to the folder where you want to download this repo.
run git clone https://github.com/KaiyangZhou/deep-person-reid.

Prepare data

Create a directory to store reid datasets under this repo via

cd deep-person-reid/
mkdir data/

Market1501 [7]:

download dataset to data/ from http://www.liangzheng.org/Project/project_reid.html.
extract dataset and rename to market1501.

MARS [8]:

create a directory named mars/ under data/.
download dataset to data/mars/ from http://www.liangzheng.com.cn/Project/project_mars.html.
extract bbox_train.zip and bbox_test.zip.
download split information from https://github.com/liangzheng06/MARS-evaluation/tree/master/info and put info/ in data/mars. (we want to follow the standard split in [8])
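
Once both datasets are in place, a quick sanity check of the folder layout can catch a misnamed directory before training starts. The sketch below is not part of this repo: missing_dataset_dirs and EXPECTED_DIRS are hypothetical names, and the bbox_train/ and bbox_test/ folder names assume that extracting the two zip files produces folders named after the archives.

```python
from pathlib import Path

# Layout implied by the steps above. The bbox_train/ and bbox_test/ names are
# an assumption about what extracting bbox_train.zip and bbox_test.zip yields.
EXPECTED_DIRS = [
    "market1501",
    "mars/bbox_train",
    "mars/bbox_test",
    "mars/info",
]


def missing_dataset_dirs(data_root="data"):
    """Return the expected sub-directories that do not exist under data_root."""
    root = Path(data_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing under data/:", ", ".join(missing))
    else:
        print("All expected dataset directories are present.")
```

Run it from the repo root so that the relative path data/ resolves correctly.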

Dataset loaders

These are implemented in dataset_loader.py where we have two main classes that subclass torch.utils.data.Dataset:

ImageDataset: processes image-based person reid datasets.
VideoDataset: processes video-based person reid datasets.

Both classes are passed to torch.utils.data.DataLoader to produce batched data. A data loader with ImageDataset outputs batches of shape (batch, channel, height, width), while a data loader with VideoDataset outputs batches of shape (batch, sequence, channel, height, width).
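
To make the two batch layouts concrete, here is a minimal sketch with toy stand-ins for the two classes. ToyImageDataset and ToyVideoDataset are hypothetical and return random tensors rather than real images; the actual classes in dataset_loader.py may return extra fields and take different constructor arguments. The 256x128 image size and the 15 images per tracklet are taken from values mentioned elsewhere in this README; 3 channels assumes RGB input.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyImageDataset(Dataset):
    """Hypothetical stand-in for ImageDataset: one random image per index."""

    def __init__(self, num_items=8, height=256, width=128):
        self.num_items, self.height, self.width = num_items, height, width

    def __len__(self):
        return self.num_items

    def __getitem__(self, index):
        img = torch.rand(3, self.height, self.width)  # (channel, height, width)
        pid = index % 4  # fake person id
        return img, pid


class ToyVideoDataset(ToyImageDataset):
    """Hypothetical stand-in for VideoDataset: one random tracklet per index."""

    def __init__(self, seq_len=15, **kwargs):
        super().__init__(**kwargs)
        self.seq_len = seq_len

    def __getitem__(self, index):
        # (sequence, channel, height, width)
        clip = torch.rand(self.seq_len, 3, self.height, self.width)
        return clip, index % 4


# The default collate function stacks samples along a new leading batch axis.
img_batch, img_pids = next(iter(DataLoader(ToyImageDataset(), batch_size=4)))
vid_batch, vid_pids = next(iter(DataLoader(ToyVideoDataset(), batch_size=2)))
print(tuple(img_batch.shape))  # (4, 3, 256, 128)
print(tuple(vid_batch.shape))  # (2, 15, 3, 256, 128)
```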

Models

models/ResNet.py: ResNet50 [1], ResNet50M [2].
models/DenseNet.py: DenseNet121 [3].

Loss functions

xent: cross entropy + label smoothing regularizer [5].
htri: triplet loss with hard positive/negative mining [4].
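
The two losses above can be sketched in a few lines of PyTorch. This is an illustration of the techniques described in [5] and [4], not the code in losses.py: the function names, epsilon=0.1, margin=0.3 and the unweighted sum of the two terms are all assumptions made for the example.

```python
import torch
import torch.nn.functional as F


def label_smooth_xent(logits, targets, epsilon=0.1):
    """Cross entropy against targets smoothed as (1 - eps) * one_hot + eps / K."""
    num_classes = logits.size(1)
    log_probs = F.log_softmax(logits, dim=1)
    one_hot = torch.zeros_like(log_probs).scatter_(1, targets.unsqueeze(1), 1.0)
    smoothed = (1.0 - epsilon) * one_hot + epsilon / num_classes
    return -(smoothed * log_probs).sum(dim=1).mean()


def batch_hard_triplet(features, pids, margin=0.3):
    """Triplet loss using, per anchor, the farthest positive and closest negative."""
    dist = torch.cdist(features, features)  # pairwise Euclidean distances
    same_id = pids.unsqueeze(0) == pids.unsqueeze(1)
    hardest_pos = dist.masked_fill(~same_id, float("-inf")).max(dim=1).values
    hardest_neg = dist.masked_fill(same_id, float("inf")).min(dim=1).values
    return F.relu(hardest_pos - hardest_neg + margin).mean()


torch.manual_seed(0)
logits = torch.randn(8, 751)      # 751 training identities on Market1501
features = torch.randn(8, 2048)   # 2048-dim features, as in the test log
pids = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
total = label_smooth_xent(logits, pids) + batch_hard_triplet(features, pids)
print(total.item())
```

The hard-mining step only makes sense when every identity in the batch has at least one other sample of the same identity and at least one sample of a different identity, which is why the toy batch holds two samples for each of four identities.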

We use Adam [6] everywhere, which turned out to be the most effective optimizer in our experiments.

Train

Training code is implemented mainly in

train_img_model_xent.py: train image model with cross entropy loss.
train_img_model_xent_htri.py: train image model with a combination of cross entropy loss and hard triplet loss.
train_vid_model_xent.py: train video model with cross entropy loss.
train_vid_model_xent_htri.py: train video model with a combination of cross entropy loss and hard triplet loss.

For example, to train an image reid model using ResNet50 and cross entropy loss, run

python train_img_model_xent.py -d market1501 -a resnet50 --max-epoch 60 --train-batch 32 --test-batch 32 --stepsize 20 --eval-step 20 --save-dir log/resnet50-xent-market1501 --gpu-devices 0

Then, you will see

==========
Args:Namespace(arch='resnet50', dataset='market1501', eval_step=20, evaluate=False, gamma=0.1, gpu_devices='0', height=256, lr=0.0003, max_epoch=60, print_freq=10, resume='', save_dir='log/resnet50/', seed=1, start_epoch=0, stepsize=20, test_batch=32, train_batch=32, use_cpu=False, weight_decay=0.0005, width=128, workers=4)
==========
Currently using GPU 0
Initializing dataset market1501
=> Market1501 loaded
Dataset statistics:
  ------------------------------
  subset   | # ids | # images
  ------------------------------
  train    |   751 |    12936
  query    |   750 |     3368
  gallery  |   751 |    15913
  ------------------------------
  total    |  1501 |    32217
  ------------------------------
Initializing model: resnet50
Model size: 25.04683M
==> Epoch 1/60
Batch 10/404     Loss 6.665115 (6.781841)
Batch 20/404     Loss 6.792669 (6.837275)
Batch 30/404     Loss 6.592124 (6.806587)
... ...
==> Epoch 60/60
Batch 10/404     Loss 1.101616 (1.075387)
Batch 20/404     Loss 1.055073 (1.075455)
Batch 30/404     Loss 1.081339 (1.073036)
... ...
==> Test
Extracted features for query set, obtained 3368-by-2048 matrix
Extracted features for gallery set, obtained 15913-by-2048 matrix
Computing distance matrix
Computing CMC and mAP
Results ----------
mAP: 68.8%
CMC curve
Rank-1  : 85.4%
Rank-5  : 94.1%
Rank-10 : 95.9%
Rank-20 : 97.2%
------------------
Finished. Total elapsed time (h:m:s): 1:57:44

To use multiple GPUs, you can set --gpu-devices 0,1,2,3.

Run any training script with -h, e.g. python train_img_model_xent.py -h, for details on the available arguments.
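
The Namespace printed in the log above shows lr=0.0003, stepsize=20 and gamma=0.1. Assuming these drive a standard step decay schedule (multiply the learning rate by gamma every stepsize epochs, which this README does not state explicitly), the learning rate over the 60 epochs of the example run would look as follows. lr_at_epoch is a hypothetical helper written for this illustration.

```python
def lr_at_epoch(epoch, base_lr=0.0003, stepsize=20, gamma=0.1):
    """Learning rate for a 0-indexed epoch under step decay (assumed schedule)."""
    return base_lr * gamma ** (epoch // stepsize)


# Print the rate at the first and last epoch of each 20-epoch stage.
for epoch in (0, 19, 20, 39, 40, 59):
    print(f"epoch {epoch + 1:2d}: lr = {lr_at_epoch(epoch):.1e}")
```

Under this assumption the run trains in three stages of 20 epochs each, with the rate dropping by a factor of 10 at the start of the second and third stage.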

Results

Image person reid

Market1501

Model	Size (M)	Loss	Rank-1/5/10 (%)	mAP (%)	Model weights	Published Rank-1/5/10 (%)	Published mAP (%)
DenseNet121	7.72	xent	86.5/93.6/95.7	67.8	download	-	-
DenseNet121	7.72	xent+htri	89.5/96.3/97.5	72.6	download	-	-
ResNet50	25.05	xent	85.4/94.1/95.9	68.8	download	87.3/-/-	67.6
ResNet50	25.05	xent+htri	87.5/95.3/97.3	72.3	download	-	-
ResNet50M	30.01	xent	89.0/95.5/97.3	75.0	download	89.9/-/-	75.6
ResNet50M	30.01	xent+htri	90.4/96.7/98.0	76.6	download	-	-

Video person reid

MARS

Model	Size (M)	Loss	Rank-1/5/10 (%)	mAP (%)	Model weights
DenseNet121	7.59	xent+htri	82.6/93.2/95.4	74.6	download
ResNet50	24.79	xent	74.5/88.8/91.8	64.0	download
ResNet50	24.79	xent+htri	80.8/92.1/94.3	74.0	download
ResNet50M	29.63	xent	77.8/89.8/92.8	67.5	download
ResNet50M	29.63	xent+htri	82.3/93.8/95.3	75.4	download
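
For readers unfamiliar with the Rank-k and mAP columns in the tables above, the following sketch shows how both are commonly computed from ranked match lists. It is a simplified illustration, not the code in eval_metrics.py: the function names are hypothetical, and it skips dataset-specific rules such as discarding gallery images taken by the same camera as the query.

```python
def rank_k_hit(matches, k):
    """1.0 if any of the top-k ranked gallery items is a correct match."""
    return 1.0 if any(matches[:k]) else 0.0


def average_precision(matches):
    """Mean of the precision values at each rank where a correct match occurs."""
    hits, precisions = 0, []
    for rank, is_match in enumerate(matches, start=1):
        if is_match:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def evaluate(all_matches, ks=(1, 5, 10)):
    """Average Rank-k hits (CMC) and AP over all queries."""
    n = len(all_matches)
    cmc = {k: sum(rank_k_hit(m, k) for m in all_matches) / n for k in ks}
    mean_ap = sum(average_precision(m) for m in all_matches) / n
    return cmc, mean_ap


# One list per query: gallery items sorted by ascending distance,
# True where the gallery identity equals the query identity.
queries = [
    [True, False, True, False],   # AP = (1/1 + 2/3) / 2
    [False, True, False, False],  # AP = 1/2
]
cmc, mean_ap = evaluate(queries)
print(cmc[1], cmc[5])     # 0.5 1.0
print(round(mean_ap, 4))  # 0.6667
```

Rank-1 only rewards putting a correct match first, while mAP also rewards ranking all remaining correct matches high, which is why the two columns can move differently between models.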

Test

Say you have downloaded ResNet50 trained with xent on market1501. Create a directory for model weights with mkdir saved-models/ and place the file there, so that the path to the model is saved-models/resnet50_xent_market1501.pth.tar. Then run the following command to test

python train_img_model_xent.py -d market1501 -a resnet50 --evaluate --resume saved-models/resnet50_xent_market1501.pth.tar --save-dir log/resnet50-xent-market1501 --test-batch 32

Likewise, to test a video reid model, save a pretrained model under saved-models/, e.g. saved-models/resnet50_xent_mars.pth.tar, then run

python train_vid_model_xent.py -d mars -a resnet50 --evaluate --resume saved-models/resnet50_xent_mars.pth.tar --save-dir log/resnet50-xent-mars --test-batch 2

Note that --test-batch in video reid is the number of tracklets, not the number of images. If we set this argument to 2 and sample 15 images per tracklet, the resulting number of images per batch is 2*15=30. Adjust this argument according to your GPU memory.
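
The arithmetic above can be wrapped in a small helper to see how the input size grows with --test-batch. video_batch_stats is a hypothetical function for illustration only; it assumes 3-channel float32 images at the 256x128 resolution shown in the training log and counts the input tensor alone, whereas intermediate activations inside the network usually take far more GPU memory.

```python
def video_batch_stats(test_batch, seq_len=15, channels=3, height=256, width=128,
                      bytes_per_value=4):
    """Return the batch shape, image count and raw input size in bytes."""
    shape = (test_batch, seq_len, channels, height, width)
    num_images = test_batch * seq_len
    input_bytes = num_images * channels * height * width * bytes_per_value
    return shape, num_images, input_bytes


shape, num_images, input_bytes = video_batch_stats(test_batch=2)
print(shape)        # (2, 15, 3, 256, 128)
print(num_images)   # 30
print(f"{input_bytes / 2**20:.2f} MiB of float32 input")  # 11.25 MiB
```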

References

[1] He et al. Deep Residual Learning for Image Recognition. CVPR 2016.
[2] Yu et al. The Devil is in the Middle: Exploiting Mid-level Representations for Cross-Domain Instance Matching. arXiv:1711.08106.
[3] Huang et al. Densely Connected Convolutional Networks. CVPR 2017.
[4] Hermans et al. In Defense of the Triplet Loss for Person Re-Identification. arXiv:1703.07737.
[5] Szegedy et al. Rethinking the Inception Architecture for Computer Vision. CVPR 2016.
[6] Kingma and Ba. Adam: A Method for Stochastic Optimization. ICLR 2015.
[7] Zheng et al. Scalable Person Re-identification: A Benchmark. ICCV 2015.
[8] Zheng et al. MARS: A Video Benchmark for Large-Scale Person Re-identification. ECCV 2016.
