diff --git a/.gitignore b/.gitignore index 5563689..2c245d8 100644 --- a/.gitignore +++ b/.gitignore @@ -27,3 +27,4 @@ yolov4_training/yolov4.conv.137 yolov4_training/build_docker.sh yolov4_training/dockerfile_tmp yolov4_training/yolov4.conv.137 +det-demo-tmi/voc_dog diff --git a/README.MD b/README.MD index bcba683..50ace8d 100644 --- a/README.MD +++ b/README.MD @@ -1,105 +1,114 @@ -# ymir-executor 使用文档 +# ymir-executor documentation [English](./README.MD) | [简体中文](./README_zh-CN.MD) -## det-yolov4-training +- [ymir](https://github.com/IndustryEssentials/ymir) -- yolov4的训练镜像,采用mxnet与darknet框架,默认cuda版本为`10.1`,无法直接在高版本显卡如GTX3080/GTX3090上运行,需要修改dockerfile将cuda版本提升为11.1以上,并修改其它依赖。 +- [wiki](https://github.com/modelai/ymir-executor-fork/wiki) -## det-yolov4-mining +- [ymir executor](./docs/official-docker-image.md) -- yolov4挖掘与推理镜像,与det-yolov4-training对应 +- [ymir mining algorithm](./docs/mining-images-overview.md) -## det-yolov5-tmi - -- yolov5训练、挖掘及推理镜像,训练时会从github上下载权重 - -- yolov5-FAQ - - - 镜像训练时权重下载出错或慢:提前将权重下载好并复制到镜像`/app`目录下或通过ymir导入预训练模型,在训练时进行加载。 +## overview -## live-code-executor +| docker image | [finetune](https://github.com/modelai/ymir-executor-fork/wiki/use-yolov5-to-finetune-or-training-model) | tensorboard | args/cfg options | framework | onnx | pretrained weights | +| - | - | - | - | - | - | - | +| yolov4 | ? | ✔️ | ❌ | darknet + mxnet | ❌ | local | +| yolov5 | ✔️ | ✔️ | ✔️ | pytorch | ✔️ | local+online | +| yolov7 | ✔️ | ✔️ | ✔️ | pytorch | ❌ | local+online | +| mmdetection | ✔️ | ✔️ | ✔️ | pytorch | ❌ | local+online | +| detectron2 | ✔️ | ✔️ | ✔️ | pytorch | ❌ | online | +| vidt | ? | ✔️ | ✔️ | pytorch | ❌ | online | +| nanodet | ✔️ | ✔️ | ❌ | pytorch_lightning | ❌ | local+online | -- 可以通过`git_url`, `git_branch`从网上clone代码到镜像并运行 - -- 参考 [live-code](https://github.com/IndustryEssentials/ymir-remote-git) - -## det-mmdetection-tmi +- `online` pretrained weights may download through network -- mmdetection 训练、挖掘及推理镜像,目前还没开发完 +- `local` pretrained weights have copied to docker images when building image +### benchmark -## 如何制作自己的ymir-executor +- training dataset: voc2012-train 5717 images +- validation dataset: voc2012-val 5823 images +- image size: 640 -- [ymir-executor 制作指南](https://github.com/IndustryEssentials/ymir/blob/dev/docs/ymir-dataset-zh-CN.md) +gpu: single Tesla P4 -## 如何导入预训练模型 +| docker image | batch size | epoch number | model | voc2012 val map50 | training time | note | +| - | - | - | - | - | - | - | +| yolov5 | 16 | 100 | yolov5s | 70.05% | 9h | coco-pretrained | +| vidt | 2 | 100 | swin-nano | 54.13% | 2d | imagenet-pretrained | +| yolov4 | 4 | 20000 steps | yolov4 | 66.18% | 2d | imagenet-pretrained | +| yolov7 | 16 | 100 | yolov7-tiny | 70% | 8h | coco-pretrained | -- [如何导入外部模型](https://github.com/IndustryEssentials/ymir/blob/dev/docs/import-extra-models.md) +gpu: single GeForce GTX 1080 Ti - - 通过ymir网页端的 `模型管理/模型列表/导入模型` 同样可以导入模型 +| docker image | image size | batch size | epoch number | model | voc2012 val map50 | training time | note | +| - | - | - | - | - | - | - | - | +| yolov4 | 608 | 64/32 | 20000 steps | yolov4 | 72.73% | 6h | imagenet-pretrained | +| yolov5 | 640 | 16 | 100 | yolov5s | 70.35% | 2h | coco-pretrained | +| yolov7 | 640 | 16 | 100 | yolov7-tiny | 70.4% | 5h | coco-pretrained | +| mmdetection | 640 | 16 | 100 | yolox_tiny | 66.2% | 5h | coco-pretrained | +| detectron2 | 640 | 2 | 20000 steps | retinanet_R_50_FPN_1x | 53.54% | 2h | imagenet-pretrained | +| nanodet | 416 | 16 | 100 | nanodet-plus-m_416 | 58.63% | 5h | imagenet-pretrained | --- 
-# FAQ
+# build ymir executor

-- apt 或 pip 安装慢或出错
+## det-yolov4-tmi

- - 采用国内源,如在docker file 中添加如下命令
+- yolov4 training, mining and infer docker image, using the `mxnet` and `darknet` frameworks

- ```
- RUN sed -i 's/archive.ubuntu.com/mirrors.tuna.tsinghua.edu.cn/g' /etc/apt/sources.list
+  ```
+  cd det-yolov4-tmi
+  docker build -t ymir-executor/yolov4:cuda101-tmi -f cuda101.dockerfile .

- RUN pip config set global.index-url https://mirrors.aliyun.com/pypi/simple
- ```
+  docker build -t ymir-executor/yolov4:cuda112-tmi -f cuda112.dockerfile .
+  ```

-- docker build 的时候出错,找不到相应docker file或`COPY/ADD`时出错
-
- - 回到项目根目录或docker file对应根目录,确保docker file 中`COPY/ADD`的文件与文件夹能够访问,以yolov5为例.
-
- ```
- cd ymir-executor
-
- docker build -t ymir-executor/yolov5 . -f det-yolov5-tmi/cuda111.dockerfile
- ```
-
-- 镜像运行完`/in`与`/out`目录中的文件被清理
+## det-yolov5-tmi

- - ymir系统为节省空间,会在任务`成功结束`后删除其中不必要的文件,如果不想删除,可以在部署ymir时,修改文件`ymir/command/mir/tools/command_run_in_out.py`,注释其中的`_cleanup(work_dir=work_dir)`。注意需要重新构建后端镜像
+- yolov5 training, mining and infer docker image, using the `pytorch` framework

- ```
- cd ymir
- docker build -t industryessentials/ymir-backend --build-arg PIP_SOURCE=https://pypi.mirrors.ustc.edu.cn/simple --build-arg SERVER_MODE='dev' -f Dockerfile.backend .
+```
+cd det-yolov5-tmi
+docker build -t ymir-executor/yolov5:cuda102-tmi -f cuda102.dockerfile .

- docker-compose down -v && docker-compose up -d
- ```
+docker build -t ymir-executor/yolov5:cuda111-tmi -f cuda111.dockerfile .
+```

-- 训练镜像如何调试
+## det-mmdetection-tmi

- - 先通过失败任务的tensorboard链接拿到任务id,如`t000000100000175245d1656933456`
+```
+cd det-mmdetection-tmi
+docker build -t ymir-executor/mmdet:cu102-tmi -f docker/Dockerfile.cuda102 .

- - 进入ymir部署目录 `ymir-workplace/sandbox/work_dir/TaskTypeTraining/t000000100000175245d1656933456/sub_task/t000000100000175245d1656933456`, `ls` 可以看到以下结果
+docker build -t ymir-executor/mmdet:cu111-tmi -f docker/Dockerfile.cuda111 .
+```

- ```
- # ls
- in out task_config.yaml
- ```
+## how to customize your ymir-executor

- - 挂载目录并运行镜像``,注意需要将ymir部署目录挂载到镜像中
+- [demo ymir-executor](det-demo-tmi/README.md) from zero to one, build your own ymir-executor

- ```
- docker run -it --gpus all -v $PWD/in:/in -v $PWD/out:/out -v : bash
+- [custom ymir-executor](https://github.com/IndustryEssentials/ymir/blob/dev/dev_docs/ymir-dataset-zh-CN.md)

- # 以/home/ymir/ymir-workplace作为ymir部署目录为例
- docker run -it --gpus all -v $PWD/in:/in -v $PWD/out:/out -v /home/ymir/ymir-workplace:/home/ymir/ymir-workplace bash
- ```
+- [ymir-executor-sdk](https://github.com/modelai/ymir-executor-sdk) ymir-executor development SDK.
- - 推理与挖掘镜像调试同理,注意对应目录均为`ymir-workplace/sandbox/work_dir/TaskTypeMining` +- [ymir-executor-verifer](https://github.com/modelai/ymir-executor-verifier) debug and check your ymir-executor -- 模型精度/速度如何权衡与提升 +## how to import pretrained model weights - - 模型精度与数据集大小、数据集质量、学习率、batch size、 迭代次数、模型结构、数据增强方式、损失函数等相关,在此不做展开,详情参考: +- [import and finetune model](https://github.com/modelai/ymir-executor-fork/wiki/import-and-finetune-model) - - [Object Detection in 20 Years: A Survey](https://arxiv.org/abs/1905.05055) +- ~~[import pretainted model weights](https://github.com/IndustryEssentials/ymir/blob/dev/dev_docs/import-extra-models.md)~~ - - [Paper with Code: Object Detection](https://paperswithcode.com/task/object-detection) +## reference - - [awesome object detection](https://github.com/amusi/awesome-object-detection) +- [mining algorithm: CALD](https://github.com/we1pingyu/CALD/) +- [mining algorithm: ALDD](https://gitlab.com/haghdam/deep_active_learning) +- [yolov4](https://github.com/AlexeyAB/darknet) +- [yolov5](https://github.com/ultralytics/yolov5) +- [mmdetection](https://github.com/open-mmlab/mmdetection) +- [yolov7](https://github.com/wongkinyiu/yolov7) +- [detectron2](https://github.com/facebookresearch/detectron2) +- [vidt](https://github.com/naver-ai/vidt) +- [nanodet](https://github.com/RangiLyu/nanodet) diff --git a/README_zh-CN.MD b/README_zh-CN.MD new file mode 100644 index 0000000..3579823 --- /dev/null +++ b/README_zh-CN.MD @@ -0,0 +1,247 @@ +# ymir-executor 使用文档 [English](./README.MD) | [简体中文](./README_zh-CN.MD) + +- [ymir](https://github.com/IndustryEssentials/ymir) + +- [说明文档](https://github.com/modelai/ymir-executor-fork/wiki) + +- [ymir镜像](./docs/official-docker-image.md) + +- [ymir 挖掘算法](./docs/mining-images-overview.md) + +## 比较 + +| docker image | [finetune](https://github.com/modelai/ymir-executor-fork/wiki/use-yolov5-to-finetune-or-training-model) | tensorboard | args/cfg options | framework | onnx | pretrained weight | +| - | - | - | - | - | - | - | +| yolov4 | ? | ✔️ | ❌ | darknet + mxnet | ❌ | local | +| yolov5 | ✔️ | ✔️ | ✔️ | pytorch | ✔️ | local+online | +| yolov7 | ✔️ | ✔️ | ✔️ | pytorch | ❌ | local+online | +| mmdetection | ✔️ | ✔️ | ✔️ | pytorch | ❌ | local+online | +| detectron2 | ✔️ | ✔️ | ✔️ | pytorch | ❌ | online | +| vidt | ? 
| ✔️ | ✔️ | pytorch | ❌ | online | +| nanodet | ✔️ | ✔️ | ❌ | pytorch_lightning | ❌ | local+online | + +- `online` 预训练权重可能在训练时通过网络下载 + +- `local` 预训练权重在构建镜像时复制到了镜像 + +### benchmark + +- 训练集: voc2012-train 5717 images +- 测试集: voc2012-val 5823 images +- 图像大小: 640 (nanodet为416, yolov4为608) + +**由于 coco 数据集包含 voc 数据集中的类, 因此这个对比并不公平, 仅供参考** + +gpu: single Tesla P4 + +| docker image | batch size | epoch number | model | voc2012 val map50 | training time | note | +| - | - | - | - | - | - | - | +| yolov5 | 16 | 100 | yolov5s | 70.05% | 9h | coco-pretrained | +| vidt | 2 | 100 | swin-nano | 54.13% | 2d | imagenet-pretrained | +| yolov4 | 4 | 20000 steps | yolov4 | 66.18% | 2d | imagenet-pretrained | +| yolov7 | 16 | 100 | yolov7-tiny | 70% | 8h | coco-pretrained | + +gpu: single GeForce GTX 1080 Ti + +| docker image | image size | batch size | epoch number | model | voc2012 val map50 | training time | note | +| - | - | - | - | - | - | - | - | +| yolov4 | 608 | 64/32 | 20000 steps | yolov4 | 72.73% | 6h | imagenet-pretrained | +| yolov5 | 640 | 16 | 100 | yolov5s | 70.35% | 2h | coco-pretrained | +| yolov7 | 640 | 16 | 100 | yolov7-tiny | 70.4% | 5h | coco-pretrained | +| mmdetection | 640 | 16 | 100 | yolox_tiny | 66.2% | 5h | coco-pretrained | +| detectron2 | 640 | 2 | 20000 steps | retinanet_R_50_FPN_1x | 53.54% | 2h | imagenet-pretrained | +| nanodet | 416 | 16 | 100 | nanodet-plus-m_416 | 58.63% | 5h | imagenet-pretrained | + +--- + +# 手动构建ymir镜像 + +## det-yolov4-tmi + +- yolov4的训练、挖掘与推理镜像,采用mxnet与darknet框架 + + ``` + cd det-yolov4-tmi + docker build -t ymir-executor/yolov4:cuda101-tmi -f cuda101.dockerfile . + + docker build -t ymir-executor/yolov4:cuda112-tmi -f cuda112.dockerfile . + ``` + +## det-yolov5-tmi + +- yolov5训练、挖掘及推理镜像,采用pytorch框架,镜像构建时会从github上下载权重, 如果访问github不稳定, 建议提前将模型权重下载并在构建时复制到镜像中. + +``` +cd det-yolov5-tmi +docker build -t ymir-executor/yolov5:cuda102-tmi -f cuda102.dockerfile . + +docker build -t ymir-executor/yolov5:cuda111-tmi -f cuda111.dockerfile . +``` + +## det-mmdetection-tmi + +``` +cd det-mmdetection-tmi +docker build -t ymir-executor/mmdet:cu102-tmi -f docker/Dockerfile.cuda102 . + +docker build -t ymir-executor/mmdet:cu111-tmi -f docker/Dockerfile.cuda111 . +``` + +## live-code-executor + +- 可以通过`git_url`, `commit id` 或 `tag` 从网上clone代码到镜像并运行, 不推荐使用`branch`, 因为这样拉取的代码可能随时间变化, 过程不具备可重复性. 
+ +- 参考 [live-code](https://github.com/IndustryEssentials/ymir-remote-git) + +``` +cd live-code-executor + +docker build -t ymir-executor/live-code:torch-tmi -f torch.dockerfile + +docker build -t ymir-executor/live-code:mxnet-tmi -f mxnet.dockerfile +``` + +## 如何制作自己的ymir-executor + +- [示例 ymir-executor](det-demo-tmi/README.md) 从零到一,搭建自己的 ymir-executor + +- [ymir-executor 制作指南](https://github.com/IndustryEssentials/ymir/blob/dev/dev_docs/ymir-dataset-zh-CN.md) + +- [ymir-executor-sdk](https://github.com/modelai/ymir-executor-sdk) ymir镜像开发辅助库 + +- [ymir-executor-verifer](https://github.com/modelai/ymir-executor-verifier) 调试与检测 ymir-executor + +## 如何导入预训练模型 + +- [如何导入并精调外部模型](https://github.com/modelai/ymir-executor-fork/wiki/import-and-finetune-model) + +- ~~[如何导入外部模型](https://github.com/IndustryEssentials/ymir/blob/dev/dev_docs/import-extra-models.md)~~ + + - 通过ymir网页端的 `模型管理/模型列表/导入模型` 同样可以导入模型 + +## 参考 + +- [挖掘算法CALD](https://github.com/we1pingyu/CALD/) +- [挖掘算法ALDD](https://gitlab.com/haghdam/deep_active_learning) +- [yolov4](https://github.com/AlexeyAB/darknet) +- [yolov5](https://github.com/ultralytics/yolov5) +- [mmdetection](https://github.com/open-mmlab/mmdetection) +- [yolov7](https://github.com/wongkinyiu/yolov7) +- [detectron2](https://github.com/facebookresearch/detectron2) +- [vidt](https://github.com/naver-ai/vidt) +- [nanodet](https://github.com/RangiLyu/nanodet) + +--- + +# FAQ + +## 关于cuda版本 + +- 推荐主机安装高版本驱动,支持11.2以上的cuda版本, 使用11.1及以上的镜像 + +- GTX3080/GTX3090不支持11.1以下的cuda,只能使用cuda11.1及以上的镜像 + +## apt 或 pip 安装慢或出错 + +- 采用国内源,如在docker file 中添加如下命令 + + ``` + RUN sed -i 's/archive.ubuntu.com/mirrors.tuna.tsinghua.edu.cn/g' /etc/apt/sources.list + + RUN pip config set global.index-url https://mirrors.aliyun.com/pypi/simple + ``` + +## docker build 的时候出错,找不到相应docker file或`COPY/ADD`时出错 + +- 回到项目根目录或docker file对应根目录,确保docker file 中`COPY/ADD`的文件与文件夹能够访问,以yolov5为例. + + ``` + cd ymir-executor/det-yolov5-tmi + + docker build -t ymir-executor/yolov5:cuda111 . -f cuda111.dockerfile + ``` + +## 镜像运行完`/in`与`/out`目录中的文件被清理 + +- ymir系统为节省空间,会在任务`成功结束`后删除其中不必要的文件,如果不想删除,可以在部署ymir后,修改镜像`industryessentials/ymir-backend`中的`/usr/local/lib/python3.8/dist-packages/mir/tools/command_run_in_out.py`,注释其中所有的`_cleanup(work_dir=work_dir)`, 将修改覆盖到镜像`industryessentials/ymir-backend:latest`并重启ymir + + ``` + $ docker ps |grep backend + + 580c2f1dae1b industryessentials/ymir-backend ... + 5490c294982f industryessentials/ymir-backend-redis ... + + $ docker run -it --rm industryessentials/ymir-backend:latest bash + $ vim /usr/local/lib/python3.8/dist-packages/mir/tools/command_run_in_out.py + ``` + 注释所有的`_cleanup(work_dir=work_dir)`之后,不要立即退出容器,切换到另一个终端 + ``` + $ docker ps |grep backend + + dced73e51429 industryessentials/ymir-backend # use the latest one + 580c2f1dae1b industryessentials/ymir-backend ... + 5490c294982f industryessentials/ymir-backend-redis ... 
+ + $ docker commit dced73e51429 industryessentials/ymir-backend:latest + ``` + 保存改动后,再切换回之前的终端,退出容器,重启ymir即可 + + +## 训练镜像如何调试 + +- 一般性的错误在`ymir-workplace/ymir-data/logs`下查看 + +``` +tail -f -n 100 ymir_controller.log +tail -f -n 100 ymir_app.log +``` + +![](./debug.png) + +- 先修改镜像`industryessentials/ymir-backend`,注释其中所有的`_cleanup(work_dir=work_dir)`,保存`/in`和`/out`目录下的文件 + +- 再通过失败任务的tensorboard链接拿到任务id,如`t000000100000175245d1656933456` + +- 进入ymir部署目录 `ymir-workplace/sandbox/work_dir/TaskTypeTraining/t000000100000175245d1656933456/sub_task/t000000100000175245d1656933456`, `ls` 可以看到以下结果 + + ``` + # ls + in out task_config.yaml + + # ls out + monitor.txt ymir-executor-out.log + + # ls in + assets config.yaml env.yaml ... + ``` + +- 挂载目录并运行镜像``,注意需要将ymir部署目录挂载到镜像中 + + ``` + docker run -it --gpus all --shm-size 12G -v $PWD/in:/in -v $PWD/out:/out -v : -v /sandbox//training_assset_cache:/in/assets bash + + # 以/home/ymir/ymir-workplace作为ymir部署目录为例, 以实际情况为准 + docker run -it --gpus all --shm-size 12G -v $PWD/in:/in -v /home/ymir/ymir-workplace/sandbox/0001/training_assset_cache:/in/assets -v $PWD/out:/out -v /home/ymir/ymir-workplace:/home/ymir/ymir-workplace bash + ``` + +- 进入到docker 容器中后, 执行镜像默认的命令, 如dockerfile中写的 `CMD bash /usr/bin/start.sh` + + ``` + bash /usr/bin/start.sh + ``` + +- 推理与挖掘镜像调试同理,注意对应目录均为`ymir-workplace/sandbox/work_dir/TaskTypeMining` + +## 模型精度/速度如何权衡与提升 + +- 模型精度与数据集大小、数据集质量、学习率、batch size、 迭代次数、模型结构、数据增强方式、损失函数等相关,在此不做展开,详情参考: + + - [Object Detection in 20 Years: A Survey](https://arxiv.org/abs/1905.05055) + + - [Paper with Code: Object Detection](https://paperswithcode.com/task/object-detection) + + - [awesome object detection](https://github.com/amusi/awesome-object-detection) + + - [voc2012 object detection leadboard](http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=4) + + - [coco object detection leadboard](https://cocodataset.org/#detection-leaderboard) diff --git a/debug.png b/debug.png new file mode 100644 index 0000000..e439ca6 Binary files /dev/null and b/debug.png differ diff --git a/det-demo-tmi/Dockerfile b/det-demo-tmi/Dockerfile new file mode 100644 index 0000000..9a742a9 --- /dev/null +++ b/det-demo-tmi/Dockerfile @@ -0,0 +1,25 @@ +# a docker file for an sample training / mining / infer executor + +FROM python:3.8.13-alpine + +# Add bash +RUN apk add bash +# Required to build numpy wheel +RUN apk add g++ + +COPY requirements.txt ./ +RUN pip3 install -r requirements.txt + +WORKDIR /app +# copy user code to WORKDIR +COPY ./app/start.py /app/ + +# copy user config template to /img-man +RUN mkdir -p /img-man +COPY img-man/*-template.yaml /img-man/ + +# entry point for your app +# the whole docker image will be started with `nvidia-docker run ` +# and this command will run automatically +RUN echo "python /app/start.py" > /usr/bin/start.sh +CMD bash /usr/bin/start.sh diff --git a/det-demo-tmi/README.md b/det-demo-tmi/README.md new file mode 100644 index 0000000..abccece --- /dev/null +++ b/det-demo-tmi/README.md @@ -0,0 +1,274 @@ +# ymir 用户自定义镜像制作指南 + +## 目的 + +此文档面向以下人员: + +* 为 ymir 开发训练,挖掘及推理镜像的算法人员及工程人员 + +* 希望将已经有的训练,挖掘及推理镜像对接到 ymir 系统的算法及工程人员 + +此文档将详细描述如何使用 ymir executor framework 开发新的镜像。 + +![](../docs/ymir-docker-develop.drawio.png) + +## 准备工作 + +1. 下载 ymir 工程 并构建自己的demo镜像: + +``` +git clone https://github.com/modelai/ymir-executor-fork -b ymir-dev +cd ymir-executor-fork/det-demo-tmi + +docker build -t ymir/executor:det-demo-tmi . +``` + +2. 
下载voc dog 数据集 + +``` +sudo apt install wget unzip + +wget https://github.com/modelai/ymir-executor-fork/releases/download/dataset/voc_dog_debug_sample.zip -O voc_dog_debug_sample.zip + +unzip voc_dog_debug_sample.zip +``` +运行上述脚本将得到如下目录 +``` +voc_dog +├── in # 输入目录 +│ ├── annotations # 标注文件目录 +│ ├── assets # 图像文件目录 +│ ├── train-index.tsv # 训练集索引文件 +│ └── val-index.tsv # 验证集索引文件 +└── out # 输出目录 +``` + +3. 配置 `/in/env.yaml` 与 `/in/config.yaml` + + * 示例 `voc_dog/in/env.yaml` + + * protocol_version: ymir1.3.0之后添加的字段,说明ymir接口版本 + + ``` + task_id: task0 + protocol_version: 1.0.0 + run_training: True + run_mining: False + run_infer: False + input: + root_dir: /in + assets_dir: /in/assets + annotations_dir: /in/annotations + models_dir: /in/models + training_index_file: /in/train-index.tsv + val_index_file: /in/val-index.tsv + candidate_index_file: /in/candidate-index.tsv + config_file: /in/config.yaml + output: + root_dir: /out + models_dir: /out/models + tensorboard_dir: /out/tensorboard + training_result_file: /out/models/result.yaml + mining_result_file: /out/result.tsv + infer_result_file: /out/infer-result.json + monitor_file: /out/monitor.txt + executor_log_file: /out/ymir-executor-out.log + ``` + + * 示例 `voc_dog/in/config.yaml` + ``` + class_names: + - dog + export_format: ark:raw + gpu_count: 1 + # gpu_id: '0,1,2,3' + gpu_id: '0' + pretrained_model_params: [] + shm_size: 128G + task_id: t00000020000020167c11661328921 + + # just for test, remove this key in your own docker image + expected_map: 0.983 # expected map for training task + idle_seconds: 60 # idle seconds for each task + ``` + +4. 运行测试镜像 +``` +# 交互式运行 +docker run -it --rm -v $PWD/voc_dog/in:/in -v $PWD/voc_dog/out:/out ymir/executor:det-demo-tmi bash +> bash /usr/bin/start.sh + +# 直接运行 +docker run --rm -v $PWD/voc_dog/in:/in -v $PWD/voc_dog/out:/out ymir/executor:det-demo-tmi +``` + +## ymir 对镜像的调用流程 + +ymir 通过 mir train / mir mining / mir infer 命令启动镜像,遵循以下步骤: + +1. 导出镜像需要用的图像资源以及标注资源文件 + +2. 准备镜像配置 config.yaml 及 env.yaml + +3. 通过 nvidia-docker run 激活镜像,在启动镜像时,将提供以下目录及文件: + +| 目录或文件 | 说明 | 权限 | +| --- | --- | --- | +| `/in/env.yaml` | 任务类型,任务 id,数据集索引文件位置等信息 | 只读 | +| `/in/config.yaml` | 镜像本身所用到的超参等标注信息 | 只读 | +| `/in/*-index.tsv` | 数据集索引文件 | 只读 | +| `/in/models` | 预训练模型存放目录 | 只读 | +| `/in/assets` | 图像资源存放目录 | 只读 | +| `/in/annotations` | 标注文件存放目录 | 只读 | +| `/out/tensorboard` | tensorboard 日志写入目录 | 读写 | +| `/out/models` | 结果模型保存目录 | 读写 | + +4. 镜像启动以后,完成自己的训练、挖掘或推理任务,将相应结果写入对应文件,若成功,则返回 0,若失败,则返回非 0 错误码 + +5. ymir 将正确结果或异常结果归档,完成整个过程 + +## 训练、挖掘与推理镜像的开发工具包 ymir_exc + +`app/start.py` 展示了一个简单的镜像执行部分,此文档也将基于这个样例工程来说明如何使用`ymir_exc`来开发镜像。 + +关于这个文件,有以下部分值得注意: + +1. 在 Dockerfile 中,最后一条命令说明了:当此镜像被 ymir 系统通过 nvidia-docker run 启动时,默认执行的是 `bash /usr/bin/start.sh`, 即调用 `python /app/start.py` 命令,也就是此工程中的 `app/start.py` 文件 + +2. 镜像框架相关的所有内容都在 `ymir_exc` 包中,包括以下部分: + + 安装方式 `pip install "git+https://github.com/modelai/ymir-executor-sdk.git@ymir1.3.0"`, 注意通过 ~~`pip install ymir_exc`~~ 的方式安装的版本不具有 `ymir_exc.util` 包。前者在后者的代码基础上进行了扩展,提供了更多的功能(如 `ymir_exc.util`)。 + + * `env`:环境,提供任务类型,任务 id 等信息 + + * `dataset_reader`:使用数据集读取器来取得数据集信息 + + * `result_writer`:写入训练,挖掘以及推理结果 + + * `monitor`:写入进度信息 + + * `util`: 常用函数, 如`get_merged_config()` + +3. 使用 `cfg=util.get_merged_config()` 可以取得默认的 `EasyDict` 实例,这个实例的`cfg.ymir`来源于文件 `/in/env.yaml`,如果出于测试的目的想要更改这个默认文件,可以直接更改 `settings.DEFAULT_ENV_FILE_PATH`,但在实际封装成镜像的时候,应该把它的值重新指回成默认的 `/in/env.yaml`. `cfg.param`则来源于`/in/config.yaml` + +4. 
在 `start()` 方法中,通过 `cfg.ymir` 中的 `run_training` / `run_mining` / `run_infer` 来判断本次需要执行的任务类型。如果任务类型是本镜像不支持的,可以直接报错 + +5. 虽然 `app/start.py` 展示的是一个训练,挖掘和推理多合一的镜像,开发者也可以分成若干个独立的镜像,例如,训练一个,挖掘和推理合成一个。实际应用中,镜像可以同时运行推理和挖掘这两个任务,注意其进度与单独运行时不同。 + + * 单独运行时,推理或者挖掘的进度值 `percent` 在 [0, 1] 区间,并通过 `monitor.write_monitor_logger(percent)` 记录在 `/out/monitor.txt` 中。 + + * 同时运行时, 假设先进行挖掘任务, 那么挖掘的进度值在 [0, 0.5] 区间,推理的进度度值在 [0.5, 1] 区间。 + +## 训练过程 + +`app/start.py` 中的函数 `_run_training` 展示了一个训练功能的样例,有以下部分需要注意: + +1. 超参的取得 + + * 使用 `cfg.param` 取得外部传入的超参数等信息 + + * 每个训练镜像都应该准备一个超参模板 `training-template.yaml`,ymir 系统将以此模板为基础提供超参 + + * 以下 key 为保留字,将由系统指定: + +| key | 类型 | 说明 | +| --- | --- | --- | +| class_names | list | 类别 | +| gpu_id | str | 可使用的 gpu id,以英文逗号分隔,如果为空,则表示用 cpu 训练 | +| pretrained_model_params | list | 预训练模型列表,如果指定了,则表示需要基于此模型做继续训练 | + +2. 训练集和验证集的取得:使用 `cfg.ymir.input.training_index_file` 和 `cfg.ymir.input.val_index_file` 取得训练集和验证集的索引文件。索引文件中每一行为图像绝对路径与标注绝对路径,以`\t`进行分隔。 +``` +from ymir_exc.util import get_merged_config + +cfg = get_merged_config() +with open(cfg.ymir.input.training_index_file, 'r') as fp: + lines = fp.readlines() + +for idx, line in enumerate(lines): + image_path, annotation_path = line.strip().split() + ... +``` + +3. 模型的保存 + + * 模型按当前正在进行的 stage name,分目录保存 + + * 在 `cfg.ymir.output.models_dir` 中提供了模型的保存目录,用户可以使用 pytorch, mxnet, darknet 等训练框架自带的保存方法将模型保存在此目录下的以当前 stage_name 命名的子目录中 + + * 例如,如果需要保存 stage_name 为 'epoch-5000' 的模型,则需要把这些模型文件保存到 `os.path.join(cfg.ymir.output.model_dir, 'epoch-5000')` 目录下 + + * 推荐使用 `util.write_ymir_training_result()` 方法保存训练结果 (不带目录的模型名称列表,mAP等) ,它对 `result_writer.write_model_stage()` 进行了封装,兼容性与容错性更好。 + + * 需要保存的模型实际记录在`cfg.ymir.output.training_result_file`中,ymir将依据此文件进行文件打包,供用户下载、迭代训练及推理挖掘。 + +4. 进度的记录:使用 `monitor.write_monitor_logger(percent)` 方法记录任务当前的进度,实际使用时,可以每隔若干轮迭代,根据当前迭代次数和总迭代次数来估算当前进度(一个 0 到 1 之间的数),调用此方法记录 + +## 挖掘过程 + +所谓挖掘过程指的是:提供一个基础模型,以及一个不带标注的候选数据集,在此候选数据集上进行 active learning 算法,得到每张图片的得分,并将这个得分结果保存。 + +`app/start.py` 中的函数 `_run_mining` 展示了一个数据挖掘过程的样例,有以下部分需要注意: + +1. 参数的取得 + + * 使用 `cfg = get_merged_config()` 取得外部传入的参数 `cfg.param` + + * 每个挖掘镜像都应该准备一个参数模板 `mining-template.yaml`,ymir 系统将以此模板为基础提供参数 + + * 以下 key 为保留字,将由系统指定: + +| key | 类型 | 说明 | +| --- | --- | --- | +| class_names | list | 类别 | +| gpu_id | str | 可使用的 gpu id,以英文逗号分隔,如果为空,则表示用 cpu 训练 | +| model_params_path | list | 模型路径列表,镜像应该从里面选择自己可以使用的模型,如果有多个模型可以使用,直接报错 | + +2. 候选集的取得 + + * 进行挖掘任务时,所使用的数据集是一个没有带标注的候选集,可以使用 `cfg.ymir.input.candidate_index_file` 取得挖掘数据集的索引文件,这个文件中每一行为图片的绝对路径。 + + ``` + with open(cfg.ymir.input.candidate_index_file, 'r') as fp: + lines = fp.readlines() + + for line in lines: + image_path = line.strip() + ... + ``` + +3. 结果的保存 + + * 使用 `result_writer.write_mining_result()` 对挖掘结果进行保存, 结果将保存到`cfg.ymir.output.mining_result_file`,ymir将依据这个文件进行新数据集生成。 + +## 推理过程 + +所谓推理过程指的是:提供一个基础模型,以及一个不带标注的候选数据集,在此候选数据集上进行模型推理,得到每张图片的 detection 结果(框,类别,得分),并保存此结果。 + +`app/start.py` 中的函数 `_run_infer` 展示了一个推理过程的样例,有以下部分需要注意: + +1. 参数的取得:同数据挖掘过程 + +2. 候选集的取得:同数据挖掘过程, 也是利用文件 `cfg.ymir.input.candidate_index_file` + +3. 结果的保存 + + * 推理结果本身是一个 dict,key 是候选集图片的路径,value 是一个由 `result_writer.Annotation` 构成的 list + + * 使用 `result_writer.write_infer_result()` 保存推理结果, 推理结果将保存到`cfg.ymir.output.infer_result_file`, ymir将依据这个文件进行结果展示与新数据集生成。 + +## 镜像打包 + +可以在 `Dockerfile` 的基础上构建自己的打包脚本 + +## 测试 + +可以使用以下几种方式进行测试: + +1. 通过 [ymir-executor-verifier](https://github.com/modelai/ymir-executor-verifier) 进行测试 + +2. 通过 ymir web 系统进行测试 + +3. 
通过 ymir 命令行启动 mir train / mir mining / mir infer 命令进行测试
+
+
diff --git a/det-demo-tmi/app/start.py b/det-demo-tmi/app/start.py
new file mode 100644
index 0000000..2b8e877
--- /dev/null
+++ b/det-demo-tmi/app/start.py
@@ -0,0 +1,223 @@
+import logging
+import os
+import random
+import sys
+import time
+from typing import List
+
+# view https://github.com/protocolbuffers/protobuf/issues/10051 for detail
+os.environ.setdefault('PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION', 'python')
+from tensorboardX import SummaryWriter
+from easydict import EasyDict as edict
+from ymir_exc import monitor
+from ymir_exc import result_writer as rw
+from ymir_exc.util import get_merged_config
+
+
+def start() -> int:
+    cfg = get_merged_config()
+
+    if cfg.ymir.run_training:
+        _run_training(cfg)
+    if cfg.ymir.run_mining:
+        _run_mining(cfg)
+    if cfg.ymir.run_infer:
+        _run_infer(cfg)
+
+    return 0
+
+
+def _run_training(cfg: edict) -> None:
+    """
+    sample function of training, which shows:
+    1. how to get the hyper-parameters
+    2. how to read training and validation datasets
+    3. how to write logs
+    4. how to write the training result
+    """
+    #! use `cfg.param` to get the hyper-parameters for training
+    gpu_id: str = cfg.param.get('gpu_id')
+    class_names: List[str] = cfg.param.get('class_names')
+    expected_mAP: float = cfg.param.get('expected_map', 0.6)
+    idle_seconds: float = cfg.param.get('idle_seconds', 60)
+    trigger_crash: bool = cfg.param.get('trigger_crash', False)
+    #! use `logging` or `print` to write log to console
+    # note that logging.basicConfig is invoked in the __main__ block below
+    logging.info(f'gpu device: {gpu_id}')
+    logging.info(f'dataset class names: {class_names}')
+    logging.info(f"training config: {cfg.param}")
+
+    #! count the image and annotation files
+    with open(cfg.ymir.input.training_index_file, 'r') as fp:
+        lines = fp.readlines()
+
+    valid_image_count = 0
+    valid_ann_count = 0
+
+    N = len(lines)
+    monitor_gap = max(1, N // 100)
+    for idx, line in enumerate(lines):
+        asset_path, annotation_path = line.strip().split()
+        if os.path.isfile(asset_path):
+            valid_image_count += 1
+
+        if os.path.isfile(annotation_path):
+            valid_ann_count += 1
+
+        #! use `monitor.write_monitor_logger` to write the task process percent to monitor.txt
+        if idx % monitor_gap == 0:
+            monitor.write_monitor_logger(percent=0.2 * idx / N)
+
+    logging.info(f'total image-ann pair: {N}')
+    logging.info(f'valid images: {valid_image_count}')
+    logging.info(f'valid annotations: {valid_ann_count}')
+
+    #! use `monitor.write_monitor_logger` to write the task process percent to monitor.txt
+    monitor.write_monitor_logger(percent=0.2)
+
+    # suppose we have a long-running training and have saved the final model
+    #! model output dir: os.path.join(cfg.ymir.output.models_dir, your_stage_name)
+    stage_dir = os.path.join(cfg.ymir.output.models_dir, 'epoch10')
+    os.makedirs(stage_dir, exist_ok=True)
+    with open(os.path.join(stage_dir, 'epoch10.pt'), 'w') as f:
+        f.write('fake model weight')
+    with open(os.path.join(stage_dir, 'config.py'), 'w') as f:
+        f.write('fake model config file')
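+
+    # NOTE (added for clarity): the file names passed to rw.write_model_stage()
+    # are plain names relative to this stage directory; ymir records them in
+    # cfg.ymir.output.training_result_file and packages
+    # os.path.join(cfg.ymir.output.models_dir, stage_name) for download and
+    # later finetuning, so keep the saved files and the reported names consistent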
+    #! use `rw.write_model_stage` to save the training result
+    rw.write_model_stage(stage_name='epoch10', files=['epoch10.pt', 'config.py'], mAP=random.random() / 2)
+
+    _dummy_work(idle_seconds=idle_seconds, trigger_crash=trigger_crash)
+
+    write_tensorboard_log(cfg.ymir.output.tensorboard_dir)
+
+    stage_dir = os.path.join(cfg.ymir.output.models_dir, 'epoch20')
+    os.makedirs(stage_dir, exist_ok=True)
+    with open(os.path.join(stage_dir, 'epoch20.pt'), 'w') as f:
+        f.write('fake model weight')
+    with open(os.path.join(stage_dir, 'config.py'), 'w') as f:
+        f.write('fake model config file')
+    rw.write_model_stage(stage_name='epoch20', files=['epoch20.pt', 'config.py'], mAP=expected_mAP)
+
+    #! when the task is done, write a 100% percent log
+    logging.info('training done')
+    monitor.write_monitor_logger(percent=1.0)
+
+
+def _run_mining(cfg: edict) -> None:
+    #! use `cfg.param` to get the hyper-parameters for mining
+    # pretrained models are in `cfg.ymir.input.models_dir`
+    gpu_id: str = cfg.param.get('gpu_id')
+    class_names: List[str] = cfg.param.get('class_names')
+    idle_seconds: float = cfg.param.get('idle_seconds', 60)
+    trigger_crash: bool = cfg.param.get('trigger_crash', False)
+    #! use `logging` or `print` to write log to console
+    logging.info(f"mining config: {cfg.param}")
+    logging.info(f'gpu device: {gpu_id}')
+    logging.info(f'dataset class names: {class_names}')
+
+    #! use `cfg.ymir.input.candidate_index_file` to read the candidate dataset items
+    # each line of the candidate index file is an image path; there is no annotation column
+    #! count the image files
+    with open(cfg.ymir.input.candidate_index_file, 'r') as fp:
+        lines = fp.readlines()
+
+    valid_images = []
+    valid_image_count = 0
+    for line in lines:
+        if os.path.isfile(line.strip()):
+            valid_image_count += 1
+            valid_images.append(line.strip())
+
+    #! use `monitor.write_monitor_logger` to write the task process percent to monitor.txt
+    logging.info(f"assets count: {len(lines)}, valid: {valid_image_count}")
+    monitor.write_monitor_logger(percent=0.2)
+
+    _dummy_work(idle_seconds=idle_seconds, trigger_crash=trigger_crash)
+
+    #! write the mining result
+    # here we give a fake score to each asset
+    total_length = len(valid_images)
+    mining_result = [(asset_path, index / total_length) for index, asset_path in enumerate(valid_images)]
+    rw.write_mining_result(mining_result=mining_result)
+
+    #! when the task is done, write a 100% percent log
+    logging.info('mining done')
+    monitor.write_monitor_logger(percent=1.0)
+
+
+def _run_infer(cfg: edict) -> None:
+    #! use `cfg.param` to get the hyper-parameters for infer
+    # model weights are passed through `model_params_path`, rooted at `cfg.ymir.input.models_dir`
+    class_names = cfg.param.get('class_names')
+    idle_seconds: float = cfg.param.get('idle_seconds', 60)
+    trigger_crash: bool = cfg.param.get('trigger_crash', False)
+    seed: int = cfg.param.get('seed', 15)
+    #! use `logging` or `print` to write log to console
+    logging.info(f"infer config: {cfg.param}")
+
+    #! use `cfg.ymir.input.candidate_index_file` to read the candidate dataset items
+    # each line of the candidate index file is an image path; there is no annotation column
+    with open(cfg.ymir.input.candidate_index_file, 'r') as fp:
+        lines = fp.readlines()
+
+    valid_images = []
+    invalid_images = []
+    valid_image_count = 0
+    for line in lines:
+        if os.path.isfile(line.strip()):
+            valid_image_count += 1
+            valid_images.append(line.strip())
+        else:
+            invalid_images.append(line.strip())
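+
+    # NOTE (added for clarity): keep the unreadable paths too; they get an empty
+    # annotation list in the infer result below, so that every candidate image
+    # has an entry in infer-result.json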
+
+    #! use `monitor.write_monitor_logger` to write the task process percent to monitor.txt
+    logging.info(f"assets count: {len(lines)}, valid: {valid_image_count}")
+    monitor.write_monitor_logger(percent=0.2)
+
+    _dummy_work(idle_seconds=idle_seconds, trigger_crash=trigger_crash)
+
+    #! write the infer result
+    fake_anns = []
+    random.seed(seed)
+    for class_name in class_names:
+        x = random.randint(0, 100)
+        y = random.randint(0, 100)
+        w = random.randint(50, 100)
+        h = random.randint(50, 100)
+        ann = rw.Annotation(class_name=class_name, score=random.random(), box=rw.Box(x=x, y=y, w=w, h=h))
+
+        fake_anns.append(ann)
+
+    infer_result = {asset_path: fake_anns for asset_path in valid_images}
+    for asset_path in invalid_images:
+        infer_result[asset_path] = []
+    rw.write_infer_result(infer_result=infer_result)
+
+    #! when the task is done, write a 100% percent log
+    logging.info('infer done')
+    monitor.write_monitor_logger(percent=1.0)
+
+
+def _dummy_work(idle_seconds: float, trigger_crash: bool = False, gpu_memory_size: int = 0) -> None:
+    if idle_seconds > 0:
+        time.sleep(idle_seconds)
+    if trigger_crash:
+        raise RuntimeError('app crashed')
+
+
+def write_tensorboard_log(tensorboard_dir: str) -> None:
+    tb_log = SummaryWriter(tensorboard_dir)
+
+    total_epoch = 30
+    for e in range(total_epoch):
+        tb_log.add_scalar("fake_loss", 10 / (1 + e), e)
+        time.sleep(1)
+        monitor.write_monitor_logger(percent=e / total_epoch)
+
+
+if __name__ == '__main__':
+    logging.basicConfig(stream=sys.stdout,
+                        format='%(levelname)-8s: [%(asctime)s] %(message)s',
+                        datefmt='%Y%m%d-%H:%M:%S',
+                        level=logging.INFO)
+    sys.exit(start())
diff --git a/det-demo-tmi/img-man/infer-template.yaml b/det-demo-tmi/img-man/infer-template.yaml
new file mode 100644
index 0000000..b3d45dd
--- /dev/null
+++ b/det-demo-tmi/img-man/infer-template.yaml
@@ -0,0 +1,12 @@
+# infer template for your executor app
+# after building the image, it should be at /img-man/infer-template.yaml
+# the keys gpu_id, task_id, model_params_path and class_names are reserved and set by ymir
+
+gpu_id: '0'
+task_id: 'default-infer-task'
+model_params_path: []
+class_names: []
+
+# just for test, remove this key in your own docker image
+idle_seconds: 3 # idle seconds for each task
+seed: 15
diff --git a/det-demo-tmi/img-man/mining-template.yaml b/det-demo-tmi/img-man/mining-template.yaml
new file mode 100644
index 0000000..5927eca
--- /dev/null
+++ b/det-demo-tmi/img-man/mining-template.yaml
@@ -0,0 +1,11 @@
+# mining template for your executor app
+# after building the image, it should be at /img-man/mining-template.yaml
+# the keys gpu_id, task_id, model_params_path and class_names are reserved and set by ymir
+
+gpu_id: '0'
+task_id: 'default-mining-task'
+model_params_path: []
+class_names: []
+
+# just for test, remove this key in your own docker image
+idle_seconds: 6 # idle seconds for each task
diff --git a/det-demo-tmi/img-man/training-template.yaml b/det-demo-tmi/img-man/training-template.yaml
new file mode 100644
index 0000000..f114648
--- /dev/null
+++ b/det-demo-tmi/img-man/training-template.yaml
@@ -0,0 +1,13 @@
+# training template for your executor app
+# after building the image, it should be at /img-man/training-template.yaml
+# the keys gpu_id, task_id, pretrained_model_params and class_names are reserved and set by ymir
+
+gpu_id: '0'
+task_id: 'default-training-task'
+pretrained_model_params: []
+class_names: []
+export_format: 'det-voc:raw'
+
+# just for test, remove this key in your own docker image
+expected_map: 0.983 # expected map for training task
+idle_seconds: 60 # idle seconds for each task
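+
+# note (added for clarity): the values above are only the defaults shown to the
+# user; at run time ymir merges the user's edits into /in/config.yaml, which
+# the executor reads as cfg.param via ymir_exc.util.get_merged_config()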
diff --git a/det-demo-tmi/requirements.txt b/det-demo-tmi/requirements.txt
new file mode 100644
index 0000000..0517cf4
--- /dev/null
+++ b/det-demo-tmi/requirements.txt
@@ -0,0 +1,4 @@
+pydantic>=1.8.2
+pyyaml>=5.4.1
+tensorboardX>=2.4
+-e "git+https://github.com/modelai/ymir-executor-sdk.git@ymir1.3.0"
diff --git a/det-mmdetection-tmi/README.md b/det-mmdetection-tmi/README.md
index c1d63cc..f1c0ab6 100644
--- a/det-mmdetection-tmi/README.md
+++ b/det-mmdetection-tmi/README.md
@@ -1,329 +1,34 @@
-
- -
 
-
- OpenMMLab website - - - HOT - - -      - OpenMMLab platform - - - TRY IT OUT - - -
-
 
+# det-mmdetection-tmi -[![PyPI](https://img.shields.io/pypi/v/mmdet)](https://pypi.org/project/mmdet) -[![docs](https://img.shields.io/badge/docs-latest-blue)](https://mmdetection.readthedocs.io/en/latest/) -[![badge](https://github.com/open-mmlab/mmdetection/workflows/build/badge.svg)](https://github.com/open-mmlab/mmdetection/actions) -[![codecov](https://codecov.io/gh/open-mmlab/mmdetection/branch/master/graph/badge.svg)](https://codecov.io/gh/open-mmlab/mmdetection) -[![license](https://img.shields.io/github/license/open-mmlab/mmdetection.svg)](https://github.com/open-mmlab/mmdetection/blob/master/LICENSE) -[![open issues](https://isitmaintained.com/badge/open/open-mmlab/mmdetection.svg)](https://github.com/open-mmlab/mmdetection/issues) +- [mmdetection](./README_mmdet.md) - +`mmdetection` framework for object `det`ection `t`raining/`m`ining/`i`nfer task -[📘Documentation](https://mmdetection.readthedocs.io/en/v2.21.0/) | -[🛠️Installation](https://mmdetection.readthedocs.io/en/v2.21.0/get_started.html) | -[👀Model Zoo](https://mmdetection.readthedocs.io/en/v2.21.0/model_zoo.html) | -[🆕Update News](https://mmdetection.readthedocs.io/en/v2.21.0/changelog.html) | -[🚀Ongoing Projects](https://github.com/open-mmlab/mmdetection/projects) | -[🤔Reporting Issues](https://github.com/open-mmlab/mmdetection/issues/new/choose) - -
- -## Introduction - -English | [简体中文](README_zh-CN.md) - -MMDetection is an open source object detection toolbox based on PyTorch. It is -a part of the [OpenMMLab](https://openmmlab.com/) project. - -The master branch works with **PyTorch 1.5+**. - -
-Major features - -- **Modular Design** - - We decompose the detection framework into different components and one can easily construct a customized object detection framework by combining different modules. - -- **Support of multiple frameworks out of box** - - The toolbox directly supports popular and contemporary detection frameworks, *e.g.* Faster RCNN, Mask RCNN, RetinaNet, etc. - -- **High efficiency** - - All basic bbox and mask operations run on GPUs. The training speed is faster than or comparable to other codebases, including [Detectron2](https://github.com/facebookresearch/detectron2), [maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark) and [SimpleDet](https://github.com/TuSimple/simpledet). - -- **State of the art** - - The toolbox stems from the codebase developed by the *MMDet* team, who won [COCO Detection Challenge](http://cocodataset.org/#detection-leaderboard) in 2018, and we keep pushing it forward. - -
- -Apart from MMDetection, we also released a library [mmcv](https://github.com/open-mmlab/mmcv) for computer vision research, which is heavily depended on by this toolbox. - -## License - -This project is released under the [Apache 2.0 license](LICENSE). - -## Changelog - -**2.22.0** was released in 24/2/2022: - -- Support [MaskFormer](configs/maskformer), [DyHead](configs/dyhead), [OpenImages Dataset](configs/openimages) and [TIMM backbone](configs/timm_example) -- Support visualization for Panoptic Segmentation -- Release a good recipe of using ResNet in object detectors pre-trained by [ResNet Strikes Back](https://arxiv.org/abs/2110.00476), which consistently brings about 3~4 mAP improvements over RetinaNet, Faster/Mask/Cascade Mask R-CNN - -Please refer to [changelog.md](docs/en/changelog.md) for details and release history. - -For compatibility changes between different versions of MMDetection, please refer to [compatibility.md](docs/en/compatibility.md). - -## Overview of Benchmark and Model Zoo - -Results and models are available in the [model zoo](docs/en/model_zoo.md). - -
- Architectures -
- - - - - - - - - - - - - - - - - -
- Object Detection - - Instance Segmentation - - Panoptic Segmentation - - Other -
- - - - - - - -
  • Contrastive Learning
  • - - -
  • Distillation
  • - - -
    - -
    - Components -
    - - - - - - - - - - - - - - - - - -
    - Backbones - - Necks - - Loss - - Common -
    - - - - - - - -
    - -Some other methods are also supported in [projects using MMDetection](./docs/en/projects.md). - -## Installation - -Please refer to [get_started.md](docs/en/get_started.md) for installation. - -## Getting Started - -Please see [get_started.md](docs/en/get_started.md) for the basic usage of MMDetection. -We provide [colab tutorial](demo/MMDet_Tutorial.ipynb), and full guidance for quick run [with existing dataset](docs/en/1_exist_data_model.md) and [with new dataset](docs/en/2_new_data_model.md) for beginners. -There are also tutorials for [finetuning models](docs/en/tutorials/finetune.md), [adding new dataset](docs/en/tutorials/customize_dataset.md), [designing data pipeline](docs/en/tutorials/data_pipeline.md), [customizing models](docs/en/tutorials/customize_models.md), [customizing runtime settings](docs/en/tutorials/customize_runtime.md) and [useful tools](docs/en/useful_tools.md). - -Please refer to [FAQ](docs/en/faq.md) for frequently asked questions. - -## Contributing - -We appreciate all contributions to improve MMDetection. Ongoing projects can be found in out [GitHub Projects](https://github.com/open-mmlab/mmdetection/projects). Welcome community users to participate in these projects. Please refer to [CONTRIBUTING.md](.github/CONTRIBUTING.md) for the contributing guideline. - -## Acknowledgement - -MMDetection is an open source project that is contributed by researchers and engineers from various colleges and companies. We appreciate all the contributors who implement their methods or add new features, as well as users who give valuable feedbacks. -We wish that the toolbox and benchmark could serve the growing research community by providing a flexible toolkit to reimplement existing methods and develop their own new detectors. - -## Citation - -If you use this toolbox or benchmark in your research, please cite this project. +# build docker image ``` -@article{mmdetection, - title = {{MMDetection}: Open MMLab Detection Toolbox and Benchmark}, - author = {Chen, Kai and Wang, Jiaqi and Pang, Jiangmiao and Cao, Yuhang and - Xiong, Yu and Li, Xiaoxiao and Sun, Shuyang and Feng, Wansen and - Liu, Ziwei and Xu, Jiarui and Zhang, Zheng and Cheng, Dazhi and - Zhu, Chenchen and Cheng, Tianheng and Zhao, Qijie and Li, Buyu and - Lu, Xin and Zhu, Rui and Wu, Yue and Dai, Jifeng and Wang, Jingdong - and Shi, Jianping and Ouyang, Wanli and Loy, Chen Change and Lin, Dahua}, - journal= {arXiv preprint arXiv:1906.07155}, - year={2019} -} +docker build -t ymir-executor/mmdet:cuda102-tmi --build-arg YMIR=1.1.0 -f docker/Dockerfile.cuda102 . + +docker build -t ymir-executor/mmdet:cuda111-tmi --build-arg YMIR=1.1.0 -f docker/Dockerfile.cuda111 . ``` -## Projects in OpenMMLab -- [MMCV](https://github.com/open-mmlab/mmcv): OpenMMLab foundational library for computer vision. -- [MIM](https://github.com/open-mmlab/mim): MIM installs OpenMMLab packages. -- [MMClassification](https://github.com/open-mmlab/mmclassification): OpenMMLab image classification toolbox and benchmark. -- [MMDetection](https://github.com/open-mmlab/mmdetection): OpenMMLab detection toolbox and benchmark. -- [MMDetection3D](https://github.com/open-mmlab/mmdetection3d): OpenMMLab's next-generation platform for general 3D object detection. -- [MMRotate](https://github.com/open-mmlab/mmrotate): OpenMMLab rotated object detection toolbox and benchmark. -- [MMSegmentation](https://github.com/open-mmlab/mmsegmentation): OpenMMLab semantic segmentation toolbox and benchmark. 
-- [MMOCR](https://github.com/open-mmlab/mmocr): OpenMMLab text detection, recognition, and understanding toolbox.
-- [MMPose](https://github.com/open-mmlab/mmpose): OpenMMLab pose estimation toolbox and benchmark.
-- [MMHuman3D](https://github.com/open-mmlab/mmhuman3d): OpenMMLab 3D human parametric model toolbox and benchmark.
-- [MMSelfSup](https://github.com/open-mmlab/mmselfsup): OpenMMLab self-supervised learning toolbox and benchmark.
-- [MMRazor](https://github.com/open-mmlab/mmrazor): OpenMMLab model compression toolbox and benchmark.
-- [MMFewShot](https://github.com/open-mmlab/mmfewshot): OpenMMLab fewshot learning toolbox and benchmark.
-- [MMAction2](https://github.com/open-mmlab/mmaction2): OpenMMLab's next-generation action understanding toolbox and benchmark.
-- [MMTracking](https://github.com/open-mmlab/mmtracking): OpenMMLab video perception toolbox and benchmark.
-- [MMFlow](https://github.com/open-mmlab/mmflow): OpenMMLab optical flow toolbox and benchmark.
-- [MMEditing](https://github.com/open-mmlab/mmediting): OpenMMLab image and video editing toolbox.
-- [MMGeneration](https://github.com/open-mmlab/mmgeneration): OpenMMLab image and video generative models toolbox.
-- [MMDeploy](https://github.com/open-mmlab/mmdeploy): OpenMMLab model deployment framework.
+# changelog
+- modify `mmdet/datasets/coco.py` to save the evaluation result in json format to the file named by the environment variable `COCO_EVAL_TMP_FILE`
+- modify `mmdet/core/evaluation/eval_hooks.py` to write the training result file and monitor the task progress
+- modify `mmdet/datasets/__init__.py`, `mmdet/datasets/coco.py` and add `mmdet/datasets/ymir.py` with class `YmirDataset` to load YMIR datasets
+- modify `requirements/runtime.txt` to add new dependencies
+- add `mmdet/utils/util_ymir.py` for ymir training/infer/mining
+- add `ymir_infer.py` for infer
+- add `ymir_mining.py` for mining
+- add `ymir_train.py` (a modified `tools/train.py`) to update the mmcv config for training
+- add `start.py`, the entrypoint for the docker image
+- add `training-template.yaml`, `infer-template.yaml`, `mining-template.yaml` for ymir pre-defined hyper-parameters
+- add `docker/Dockerfile.cuda102`, `docker/Dockerfile.cuda111` to build the docker images
+- remove `docker/Dockerfile` to avoid misuse
+
+---
+
+- 2022/09/06: set `find_unused_parameters = True` to fix a DDP bug
+- 2022/10/18: add `random` and `aldd` mining algorithms; `aldd` supports yolox only
+- 2022/10/19: fix a training class_number bug in `recursive_modify_attribute()`
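+
+As a minimal sketch of how the patched evaluation hook can feed ymir (the
+function name `report_epoch`, the metric key `bbox_mAP_50` and the checkpoint
+name are illustrative assumptions, not the actual hook code):
+
+```python
+import json
+import os
+
+from ymir_exc import monitor
+from ymir_exc import result_writer as rw
+
+
+def report_epoch(epoch: int, max_epochs: int) -> None:
+    # the patched mmdet/datasets/coco.py dumps evaluation metrics here
+    eval_file = os.environ.get('COCO_EVAL_TMP_FILE', '')
+    if not eval_file or not os.path.isfile(eval_file):
+        return
+
+    with open(eval_file, 'r') as fp:
+        eval_result = json.load(fp)
+
+    # the metric key is an assumption; use whatever the patched coco.py writes
+    map50 = eval_result.get('bbox_mAP_50', 0.0)
+
+    # register this epoch's checkpoint and report progress to ymir
+    rw.write_model_stage(stage_name=f'epoch_{epoch}',
+                         files=[f'epoch_{epoch}.pth'],
+                         mAP=map50)
+    monitor.write_monitor_logger(percent=epoch / max_epochs)
+```
diff --git a/det-mmdetection-tmi/README_mmdet.md b/det-mmdetection-tmi/README_mmdet.md
new file mode 100644
index 0000000..c1d63cc
--- /dev/null
+++ b/det-mmdetection-tmi/README_mmdet.md
@@ -0,0 +1,329 @@
+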
    + +
     
    +
    + OpenMMLab website + + + HOT + + +      + OpenMMLab platform + + + TRY IT OUT + + +
    +
     
    + +[![PyPI](https://img.shields.io/pypi/v/mmdet)](https://pypi.org/project/mmdet) +[![docs](https://img.shields.io/badge/docs-latest-blue)](https://mmdetection.readthedocs.io/en/latest/) +[![badge](https://github.com/open-mmlab/mmdetection/workflows/build/badge.svg)](https://github.com/open-mmlab/mmdetection/actions) +[![codecov](https://codecov.io/gh/open-mmlab/mmdetection/branch/master/graph/badge.svg)](https://codecov.io/gh/open-mmlab/mmdetection) +[![license](https://img.shields.io/github/license/open-mmlab/mmdetection.svg)](https://github.com/open-mmlab/mmdetection/blob/master/LICENSE) +[![open issues](https://isitmaintained.com/badge/open/open-mmlab/mmdetection.svg)](https://github.com/open-mmlab/mmdetection/issues) + + + +[📘Documentation](https://mmdetection.readthedocs.io/en/v2.21.0/) | +[🛠️Installation](https://mmdetection.readthedocs.io/en/v2.21.0/get_started.html) | +[👀Model Zoo](https://mmdetection.readthedocs.io/en/v2.21.0/model_zoo.html) | +[🆕Update News](https://mmdetection.readthedocs.io/en/v2.21.0/changelog.html) | +[🚀Ongoing Projects](https://github.com/open-mmlab/mmdetection/projects) | +[🤔Reporting Issues](https://github.com/open-mmlab/mmdetection/issues/new/choose) + +
    + +## Introduction + +English | [简体中文](README_zh-CN.md) + +MMDetection is an open source object detection toolbox based on PyTorch. It is +a part of the [OpenMMLab](https://openmmlab.com/) project. + +The master branch works with **PyTorch 1.5+**. + +
    +Major features + +- **Modular Design** + + We decompose the detection framework into different components and one can easily construct a customized object detection framework by combining different modules. + +- **Support of multiple frameworks out of box** + + The toolbox directly supports popular and contemporary detection frameworks, *e.g.* Faster RCNN, Mask RCNN, RetinaNet, etc. + +- **High efficiency** + + All basic bbox and mask operations run on GPUs. The training speed is faster than or comparable to other codebases, including [Detectron2](https://github.com/facebookresearch/detectron2), [maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark) and [SimpleDet](https://github.com/TuSimple/simpledet). + +- **State of the art** + + The toolbox stems from the codebase developed by the *MMDet* team, who won [COCO Detection Challenge](http://cocodataset.org/#detection-leaderboard) in 2018, and we keep pushing it forward. + +
    + +Apart from MMDetection, we also released a library [mmcv](https://github.com/open-mmlab/mmcv) for computer vision research, which is heavily depended on by this toolbox. + +## License + +This project is released under the [Apache 2.0 license](LICENSE). + +## Changelog + +**2.22.0** was released in 24/2/2022: + +- Support [MaskFormer](configs/maskformer), [DyHead](configs/dyhead), [OpenImages Dataset](configs/openimages) and [TIMM backbone](configs/timm_example) +- Support visualization for Panoptic Segmentation +- Release a good recipe of using ResNet in object detectors pre-trained by [ResNet Strikes Back](https://arxiv.org/abs/2110.00476), which consistently brings about 3~4 mAP improvements over RetinaNet, Faster/Mask/Cascade Mask R-CNN + +Please refer to [changelog.md](docs/en/changelog.md) for details and release history. + +For compatibility changes between different versions of MMDetection, please refer to [compatibility.md](docs/en/compatibility.md). + +## Overview of Benchmark and Model Zoo + +Results and models are available in the [model zoo](docs/en/model_zoo.md). + +
    + Architectures +
    + + + + + + + + + + + + + + + + + +
    + Object Detection + + Instance Segmentation + + Panoptic Segmentation + + Other +
    + + + + + + + +
  • Contrastive Learning
  • + + +
  • Distillation
  • + + +
    + +
    + Components +
    + + + + + + + + + + + + + + + + + +
    + Backbones + + Necks + + Loss + + Common +
    + + + + + + + +
    + +Some other methods are also supported in [projects using MMDetection](./docs/en/projects.md). + +## Installation + +Please refer to [get_started.md](docs/en/get_started.md) for installation. + +## Getting Started + +Please see [get_started.md](docs/en/get_started.md) for the basic usage of MMDetection. +We provide [colab tutorial](demo/MMDet_Tutorial.ipynb), and full guidance for quick run [with existing dataset](docs/en/1_exist_data_model.md) and [with new dataset](docs/en/2_new_data_model.md) for beginners. +There are also tutorials for [finetuning models](docs/en/tutorials/finetune.md), [adding new dataset](docs/en/tutorials/customize_dataset.md), [designing data pipeline](docs/en/tutorials/data_pipeline.md), [customizing models](docs/en/tutorials/customize_models.md), [customizing runtime settings](docs/en/tutorials/customize_runtime.md) and [useful tools](docs/en/useful_tools.md). + +Please refer to [FAQ](docs/en/faq.md) for frequently asked questions. + +## Contributing + +We appreciate all contributions to improve MMDetection. Ongoing projects can be found in out [GitHub Projects](https://github.com/open-mmlab/mmdetection/projects). Welcome community users to participate in these projects. Please refer to [CONTRIBUTING.md](.github/CONTRIBUTING.md) for the contributing guideline. + +## Acknowledgement + +MMDetection is an open source project that is contributed by researchers and engineers from various colleges and companies. We appreciate all the contributors who implement their methods or add new features, as well as users who give valuable feedbacks. +We wish that the toolbox and benchmark could serve the growing research community by providing a flexible toolkit to reimplement existing methods and develop their own new detectors. + +## Citation + +If you use this toolbox or benchmark in your research, please cite this project. + +``` +@article{mmdetection, + title = {{MMDetection}: Open MMLab Detection Toolbox and Benchmark}, + author = {Chen, Kai and Wang, Jiaqi and Pang, Jiangmiao and Cao, Yuhang and + Xiong, Yu and Li, Xiaoxiao and Sun, Shuyang and Feng, Wansen and + Liu, Ziwei and Xu, Jiarui and Zhang, Zheng and Cheng, Dazhi and + Zhu, Chenchen and Cheng, Tianheng and Zhao, Qijie and Li, Buyu and + Lu, Xin and Zhu, Rui and Wu, Yue and Dai, Jifeng and Wang, Jingdong + and Shi, Jianping and Ouyang, Wanli and Loy, Chen Change and Lin, Dahua}, + journal= {arXiv preprint arXiv:1906.07155}, + year={2019} +} +``` + +## Projects in OpenMMLab + +- [MMCV](https://github.com/open-mmlab/mmcv): OpenMMLab foundational library for computer vision. +- [MIM](https://github.com/open-mmlab/mim): MIM installs OpenMMLab packages. +- [MMClassification](https://github.com/open-mmlab/mmclassification): OpenMMLab image classification toolbox and benchmark. +- [MMDetection](https://github.com/open-mmlab/mmdetection): OpenMMLab detection toolbox and benchmark. +- [MMDetection3D](https://github.com/open-mmlab/mmdetection3d): OpenMMLab's next-generation platform for general 3D object detection. +- [MMRotate](https://github.com/open-mmlab/mmrotate): OpenMMLab rotated object detection toolbox and benchmark. +- [MMSegmentation](https://github.com/open-mmlab/mmsegmentation): OpenMMLab semantic segmentation toolbox and benchmark. +- [MMOCR](https://github.com/open-mmlab/mmocr): OpenMMLab text detection, recognition, and understanding toolbox. +- [MMPose](https://github.com/open-mmlab/mmpose): OpenMMLab pose estimation toolbox and benchmark. 
+- [MMHuman3D](https://github.com/open-mmlab/mmhuman3d): OpenMMLab 3D human parametric model toolbox and benchmark. +- [MMSelfSup](https://github.com/open-mmlab/mmselfsup): OpenMMLab self-supervised learning toolbox and benchmark. +- [MMRazor](https://github.com/open-mmlab/mmrazor): OpenMMLab model compression toolbox and benchmark. +- [MMFewShot](https://github.com/open-mmlab/mmfewshot): OpenMMLab fewshot learning toolbox and benchmark. +- [MMAction2](https://github.com/open-mmlab/mmaction2): OpenMMLab's next-generation action understanding toolbox and benchmark. +- [MMTracking](https://github.com/open-mmlab/mmtracking): OpenMMLab video perception toolbox and benchmark. +- [MMFlow](https://github.com/open-mmlab/mmflow): OpenMMLab optical flow toolbox and benchmark. +- [MMEditing](https://github.com/open-mmlab/mmediting): OpenMMLab image and video editing toolbox. +- [MMGeneration](https://github.com/open-mmlab/mmgeneration): OpenMMLab image and video generative models toolbox. +- [MMDeploy](https://github.com/open-mmlab/mmdeploy): OpenMMLab model deployment framework. diff --git a/det-mmdetection-tmi/docker/Dockerfile b/det-mmdetection-tmi/docker/Dockerfile deleted file mode 100644 index 5ee7a37..0000000 --- a/det-mmdetection-tmi/docker/Dockerfile +++ /dev/null @@ -1,25 +0,0 @@ -ARG PYTORCH="1.6.0" -ARG CUDA="10.1" -ARG CUDNN="7" - -FROM pytorch/pytorch:${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel - -ENV TORCH_CUDA_ARCH_LIST="6.0 6.1 7.0+PTX" -ENV TORCH_NVCC_FLAGS="-Xfatbin -compress-all" -ENV CMAKE_PREFIX_PATH="$(dirname $(which conda))/../" - -RUN apt-get update && apt-get install -y ffmpeg libsm6 libxext6 git ninja-build libglib2.0-0 libsm6 libxrender-dev libxext6 \ - && apt-get clean \ - && rm -rf /var/lib/apt/lists/* - -# Install MMCV -RUN pip install --no-cache-dir --upgrade pip wheel setuptools -RUN pip install --no-cache-dir mmcv-full==1.3.17 -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.6.0/index.html - -# Install MMDetection -RUN conda clean --all -RUN git clone https://github.com/open-mmlab/mmdetection.git /mmdetection -WORKDIR /mmdetection -ENV FORCE_CUDA="1" -RUN pip install --no-cache-dir -r requirements/build.txt -RUN pip install --no-cache-dir -e . diff --git a/det-mmdetection-tmi/docker/Dockerfile.cuda102 b/det-mmdetection-tmi/docker/Dockerfile.cuda102 new file mode 100644 index 0000000..2fd8643 --- /dev/null +++ b/det-mmdetection-tmi/docker/Dockerfile.cuda102 @@ -0,0 +1,42 @@ +ARG PYTORCH="1.8.1" +ARG CUDA="10.2" +ARG CUDNN="7" + +FROM pytorch/pytorch:${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel + +# mmcv>=1.3.17, <=1.5.0 +ARG MMCV="1.4.3" +ARG YMIR="1.1.0" + +ENV TORCH_CUDA_ARCH_LIST="6.0 6.1 7.0+PTX" +ENV TORCH_NVCC_FLAGS="-Xfatbin -compress-all" +ENV CMAKE_PREFIX_PATH="$(dirname $(which conda))/../" +ENV LANG=C.UTF-8 +ENV FORCE_CUDA="1" +ENV PYTHONPATH=. 
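+# expose the ymir interface version (from ARG YMIR above) to code running inside the image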
+ENV YMIR_VERSION=${YMIR} +# Set timezone +RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime \ + && echo 'Asia/Shanghai' >/etc/timezone + +RUN apt-key adv --keyserver keyserver.ubuntu.com --recv-keys A4B469963BF863CC \ + && apt-get update \ + && apt-get install -y build-essential ffmpeg libsm6 libxext6 git ninja-build libglib2.0-0 libsm6 libxrender-dev libxext6 \ + && apt-get clean \ + && rm -rf /var/lib/apt/lists/* + +# Install ymir-exc sdk and MMCV (no cu102/torch1.8.1, use torch1.8.0 instead) +RUN pip install --no-cache-dir --upgrade pip wheel setuptools \ + && pip install --no-cache-dir mmcv-full==${MMCV} -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html \ + && pip install "git+https://github.com/modelai/ymir-executor-sdk.git@ymir1.3.0" \ + && conda clean --all + +# Install det-mmdetection-tmi +COPY . /app/ +WORKDIR /app +RUN pip install --no-cache-dir -r requirements/runtime.txt \ + && mkdir /img-man \ + && mv *-template.yaml /img-man \ + && echo "cd /app && python3 start.py" > /usr/bin/start.sh + +CMD bash /usr/bin/start.sh diff --git a/det-mmdetection-tmi/docker/Dockerfile.cuda111 b/det-mmdetection-tmi/docker/Dockerfile.cuda111 new file mode 100644 index 0000000..2306105 --- /dev/null +++ b/det-mmdetection-tmi/docker/Dockerfile.cuda111 @@ -0,0 +1,49 @@ +ARG PYTORCH="1.8.0" +ARG CUDA="11.1" +ARG CUDNN="8" + +FROM pytorch/pytorch:${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-runtime + +# mmcv>=1.3.17, <=1.5.0 +ARG MMCV="1.4.3" +ARG YMIR="1.1.0" + +ENV TORCH_CUDA_ARCH_LIST="6.0 6.1 7.0+PTX" +ENV TORCH_NVCC_FLAGS="-Xfatbin -compress-all" +ENV CMAKE_PREFIX_PATH="$(dirname $(which conda))/../" +ENV FORCE_CUDA="1" +ENV PYTHONPATH=. +ENV YMIR_VERSION=${YMIR} +# Set timezone +RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime \ + && echo 'Asia/Shanghai' >/etc/timezone + +# Install apt package +RUN apt-get update && apt-get install -y build-essential ffmpeg libsm6 libxext6 git ninja-build libglib2.0-0 libsm6 libxrender-dev libxext6 \ + && apt-get clean \ + && rm -rf /var/lib/apt/lists/* + +# Install ymir-exc sdk and MMCV +RUN pip install --no-cache-dir --upgrade pip wheel setuptools \ + && pip install --no-cache-dir mmcv-full==${MMCV} -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.8.0/index.html \ + && pip install "git+https://github.com/modelai/ymir-executor-sdk.git@ymir1.3.0" \ + && conda clean --all + +# Install det-mmdetection-tmi +COPY . 
/app/ +WORKDIR /app +RUN pip install --no-cache-dir -r requirements/runtime.txt \ + && mkdir /img-man \ + && mv *-template.yaml /img-man \ + && echo "cd /app && python3 start.py" > /usr/bin/start.sh + +# Download coco-pretrained yolox weight to /weights +# view https://github.com/open-mmlab/mmdetection/tree/master/configs/yolox for detail +# RUN apt-get update && apt install -y wget && rm -rf /var/lib/apt/lists/* +# RUN mkdir -p /weights && cd /weights \ +# && wget https://download.openmmlab.com/mmdetection/v2.0/yolox/yolox_tiny_8x8_300e_coco/yolox_tiny_8x8_300e_coco_20211124_171234-b4047906.pth \ +# && wget https://download.openmmlab.com/mmdetection/v2.0/yolox/yolox_s_8x8_300e_coco/yolox_s_8x8_300e_coco_20211121_095711-4592a793.pth \ +# && wget https://download.openmmlab.com/mmdetection/v2.0/yolox/yolox_l_8x8_300e_coco/yolox_l_8x8_300e_coco_20211126_140236-d3bd2b23.pth \ +# && wget https://download.openmmlab.com/mmdetection/v2.0/yolox/yolox_x_8x8_300e_coco/yolox_x_8x8_300e_coco_20211126_140254-1ef88d67.pth + +CMD bash /usr/bin/start.sh diff --git a/det-mmdetection-tmi/infer-template.yaml b/det-mmdetection-tmi/infer-template.yaml new file mode 100644 index 0000000..de78f9c --- /dev/null +++ b/det-mmdetection-tmi/infer-template.yaml @@ -0,0 +1,4 @@ +shm_size: '128G' +export_format: 'ark:raw' +cfg_options: '' +conf_threshold: 0.2 diff --git a/det-mmdetection-tmi/mining-template.yaml b/det-mmdetection-tmi/mining-template.yaml new file mode 100644 index 0000000..693463b --- /dev/null +++ b/det-mmdetection-tmi/mining-template.yaml @@ -0,0 +1,5 @@ +shm_size: '128G' +export_format: 'ark:raw' +cfg_options: '' +mining_algorithm: cald +class_distribution_scores: '' # 1.0,1.0,0.1,0.2 diff --git a/det-mmdetection-tmi/mining/util.py b/det-mmdetection-tmi/mining/util.py new file mode 100644 index 0000000..e69de29 diff --git a/det-mmdetection-tmi/mining/ymir_mining.py b/det-mmdetection-tmi/mining/ymir_mining.py new file mode 100644 index 0000000..506506d --- /dev/null +++ b/det-mmdetection-tmi/mining/ymir_mining.py @@ -0,0 +1,412 @@ +""" +data augmentations for CALD method, including horizontal_flip, rotate(5'), cutout +official code: https://github.com/we1pingyu/CALD/blob/master/cald/cald_helper.py +""" +import os +import random +import sys +from typing import Any, Callable, Dict, List, Tuple + +import cv2 +import numpy as np +import torch +import torch.distributed as dist +from easydict import EasyDict as edict +from mmcv.runner import init_dist +from mmdet.apis.test import collect_results_gpu +from mmdet.utils.util_ymir import BBOX, CV_IMAGE +from nptyping import NDArray +from scipy.stats import entropy +from tqdm import tqdm +from ymir_exc import monitor +from ymir_exc import result_writer as rw +from ymir_exc.util import YmirStage, get_merged_config, get_ymir_process +from ymir_infer import YmirModel + +LOCAL_RANK = int(os.getenv('LOCAL_RANK', -1)) # https://pytorch.org/docs/stable/elastic/run.html +RANK = int(os.getenv('RANK', -1)) +WORLD_SIZE = int(os.getenv('WORLD_SIZE', 1)) + + +def intersect(boxes1: BBOX, boxes2: BBOX) -> NDArray: + ''' + Find intersection of every box combination between two sets of box + boxes1: bounding boxes 1, a tensor of dimensions (n1, 4) + boxes2: bounding boxes 2, a tensor of dimensions (n2, 4) + + Out: Intersection each of boxes1 with respect to each of boxes2, + a tensor of dimensions (n1, n2) + ''' + n1 = boxes1.shape[0] + n2 = boxes2.shape[0] + max_xy = np.minimum( + np.expand_dims(boxes1[:, 2:], axis=1).repeat(n2, axis=1), + np.expand_dims(boxes2[:, 2:], 
axis=0).repeat(n1, axis=0))
+
+    min_xy = np.maximum(
+        np.expand_dims(boxes1[:, :2], axis=1).repeat(n2, axis=1),
+        np.expand_dims(boxes2[:, :2], axis=0).repeat(n1, axis=0))
+    inter = np.clip(max_xy - min_xy, a_min=0, a_max=None)  # (n1, n2, 2)
+    return inter[:, :, 0] * inter[:, :, 1]  # (n1, n2)
+
+
+def horizontal_flip(image: CV_IMAGE, bbox: BBOX) \
+        -> Tuple[CV_IMAGE, BBOX]:
+    """
+    image: opencv image, [height, width, channels]
+    bbox: numpy.ndarray, [N,4] --> [x1,y1,x2,y2]
+    """
+    image = image.copy()
+
+    width = image.shape[1]
+    # Flip image horizontally
+    image = image[:, ::-1, :]
+    if len(bbox) > 0:
+        bbox = bbox.copy()
+        # Flip bbox horizontally
+        bbox[:, [0, 2]] = width - bbox[:, [2, 0]]
+    return image, bbox
+
+
+def cutout(image: CV_IMAGE,
+           bbox: BBOX,
+           cut_num: int = 2,
+           fill_val: int = 0,
+           bbox_remove_thres: float = 0.4,
+           bbox_min_thres: float = 0.1) -> Tuple[CV_IMAGE, BBOX]:
+    '''
+    Cutout augmentation
+    image: an opencv image, [height, width, channels]
+    bbox: bounding boxes, a numpy array of dimensions (#objects, 4)
+    cut_num: maximum number of cut-out patches
+    fill_val: value filled into the cut-out region
+    bbox_remove_thres: overlap ratio above which a cut bbox counts as destroyed
+    bbox_min_thres: minimum overlap ratio a cutout must have with some bbox
+
+    Out: new image, new bbox
+    '''
+    image = image.copy()
+    bbox = bbox.copy()
+
+    if len(bbox) == 0:
+        return image, bbox
+
+    original_h, original_w, original_channel = image.shape
+    count = 0
+    for _ in range(50):
+        # Random cutout size: [0.05, 0.2] of the original dimensions
+        cutout_size_h = random.uniform(0.05 * original_h, 0.2 * original_h)
+        cutout_size_w = random.uniform(0.05 * original_w, 0.2 * original_w)
+
+        # Random position for cutout
+        left = random.uniform(0, original_w - cutout_size_w)
+        right = left + cutout_size_w
+        top = random.uniform(0, original_h - cutout_size_h)
+        bottom = top + cutout_size_h
+        cutout = np.array([[float(left), float(top), float(right), float(bottom)]])
+
+        # Calculate intersect between cutout and bounding boxes
+        overlap_size = intersect(cutout, bbox)
+        area_boxes = (bbox[:, 2] - bbox[:, 0]) * (bbox[:, 3] - bbox[:, 1])
+        ratio = overlap_size / (area_boxes + 1e-14)
+        # Resample if the largest overlap ratio is above bbox_remove_thres
+        # or below bbox_min_thres
+        if ratio.max() > bbox_remove_thres or ratio.max() < bbox_min_thres:
+            continue
+
+        image[int(top):int(bottom), int(left):int(right), :] = fill_val
+        count += 1
+        if count >= cut_num:
+            break
+    return image, bbox
+
+
+def rotate(image: CV_IMAGE, bbox: BBOX, rot: float = 5) -> Tuple[CV_IMAGE, BBOX]:
+    image = image.copy()
+    bbox = bbox.copy()
+    h, w, c = image.shape
+    center = np.array([w / 2.0, h / 2.0])
+    s = max(h, w) * 1.0
+    trans = get_affine_transform(center, s, rot, [w, h])
+    if len(bbox) > 0:
+        for i in range(bbox.shape[0]):
+            x1, y1 = affine_transform(bbox[i, :2], trans)
+            x2, y2 = affine_transform(bbox[i, 2:], trans)
+            x3, y3 = affine_transform(bbox[i, [2, 1]], trans)
+            x4, y4 = affine_transform(bbox[i, [0, 3]], trans)
+            bbox[i, :2] = [min(x1, x2, x3, x4), min(y1, y2, y3, y4)]
+            bbox[i, 2:] = [max(x1, x2, x3, x4), max(y1, y2, y3, y4)]
+    image = cv2.warpAffine(image, trans, (w, h), flags=cv2.INTER_LINEAR)
+    return image, bbox
+
+
+def get_3rd_point(a: NDArray, b: NDArray) -> NDArray:
+    direct = a - b
+    return b + np.array([-direct[1], direct[0]], dtype=np.float32)
+
+
+def get_dir(src_point: NDArray, rot_rad: float) -> List:
+    sn, cs = np.sin(rot_rad), np.cos(rot_rad)
+
+    src_result = [0, 0]
+    src_result[0] = src_point[0] * cs - src_point[1] * sn
+    src_result[1] = src_point[0] * sn + src_point[1] * cs
+
+    return src_result
+
+
+def transform_preds(coords: 
NDArray, center: NDArray, scale: Any, rot: float, output_size: List) -> NDArray: + trans = get_affine_transform(center, scale, rot, output_size, inv=True) + target_coords = affine_transform(coords, trans) + return target_coords + + +def get_affine_transform(center: NDArray, + scale: Any, + rot: float, + output_size: List, + shift: NDArray = np.array([0, 0], dtype=np.float32), + inv: bool = False) -> NDArray: + if not isinstance(scale, np.ndarray) and not isinstance(scale, list): + scale = np.array([scale, scale], dtype=np.float32) + + scale_tmp = scale + src_w = scale_tmp[0] + dst_w = output_size[0] + dst_h = output_size[1] + + rot_rad = np.pi * rot / 180 + src_dir = get_dir(np.array([0, src_w * -0.5], np.float32), rot_rad) + dst_dir = np.array([0, dst_w * -0.5], np.float32) + + src = np.zeros((3, 2), dtype=np.float32) + dst = np.zeros((3, 2), dtype=np.float32) + src[0, :] = center + scale_tmp * shift + src[1, :] = center + src_dir + scale_tmp * shift + dst[0, :] = [dst_w * 0.5, dst_h * 0.5] + dst[1, :] = np.array([dst_w * 0.5, dst_h * 0.5], np.float32) + dst_dir + + src[2:, :] = get_3rd_point(src[0, :], src[1, :]) + dst[2:, :] = get_3rd_point(dst[0, :], dst[1, :]) + + if inv: + trans = cv2.getAffineTransform(np.float32(dst), np.float32(src)) + else: + trans = cv2.getAffineTransform(np.float32(src), np.float32(dst)) + + return trans + + +def affine_transform(pt: NDArray, t: NDArray) -> NDArray: + new_pt = np.array([pt[0], pt[1], 1.], dtype=np.float32).T + new_pt = np.dot(t, new_pt) + return new_pt[:2] + + +def resize(img: CV_IMAGE, boxes: BBOX, ratio: float = 0.8) -> Tuple[CV_IMAGE, BBOX]: + """ + ratio: <= 1.0 + """ + assert ratio <= 1.0, f'resize ratio {ratio} must <= 1.0' + + h, w, _ = img.shape + ow = int(w * ratio) + oh = int(h * ratio) + resize_img = cv2.resize(img, (ow, oh)) + new_img = np.zeros_like(img) + new_img[:oh, :ow] = resize_img + + if len(boxes) == 0: + return new_img, boxes + else: + return new_img, boxes * ratio + + +def get_ious(boxes1: BBOX, boxes2: BBOX) -> NDArray: + """ + args: + boxes1: np.array, (N, 4), xyxy + boxes2: np.array, (M, 4), xyxy + return: + iou: np.array, (N, M) + """ + area1 = (boxes1[:, 2] - boxes1[:, 0]) * (boxes1[:, 3] - boxes1[:, 1]) + area2 = (boxes2[:, 2] - boxes2[:, 0]) * (boxes2[:, 3] - boxes2[:, 1]) + iner_area = intersect(boxes1, boxes2) + area1 = area1.reshape(-1, 1).repeat(area2.shape[0], axis=1) + area2 = area2.reshape(1, -1).repeat(area1.shape[0], axis=0) + iou = iner_area / (area1 + area2 - iner_area + 1e-14) + return iou + + +def split_result(result: NDArray) -> Tuple[BBOX, NDArray, NDArray]: + if len(result) > 0: + bboxes = result[:, :4].astype(np.int32) + conf = result[:, 4] + class_id = result[:, 5] + else: + bboxes = np.zeros(shape=(0, 4), dtype=np.int32) + conf = np.zeros(shape=(0, 1), dtype=np.float32) + class_id = np.zeros(shape=(0, 1), dtype=np.int32) + + return bboxes, conf, class_id + + +class YmirMining(YmirModel): + def __init__(self, cfg: edict): + super().__init__(cfg) + if cfg.ymir.run_mining and cfg.ymir.run_infer: + mining_task_idx = 0 + # infer_task_idx = 1 + task_num = 2 + else: + mining_task_idx = 0 + # infer_task_idx = 0 + task_num = 1 + self.task_idx = mining_task_idx + self.task_num = task_num + + def mining(self): + with open(self.cfg.ymir.input.candidate_index_file, 'r') as f: + images = [line.strip() for line in f.readlines()] + if RANK == -1: + N = len(images) + tbar = tqdm(images) + else: + images_rank = images[RANK::WORLD_SIZE] + N = len(images_rank) + if RANK == 0: + tbar = tqdm(images_rank) + else: + tbar 
= images_rank + + monitor_gap = max(1, N // 100) + idx = -1 + beta = 1.3 + mining_result = [] + for asset_path in tbar: + # batch-level sync, avoid 30min time-out error + if LOCAL_RANK != -1: + dist.barrier() + + img = cv2.imread(asset_path) + # xyxy,conf,cls + result = self.predict(img) + bboxes, conf, _ = split_result(result) + if len(result) == 0: + # no result for the image without augmentation + mining_result.append((asset_path, -beta)) + continue + + consistency = 0.0 + aug_bboxes_dict, aug_results_dict = self.aug_predict(img, bboxes) + for key in aug_results_dict: + # no result for the image with augmentation f'{key}' + if len(aug_results_dict[key]) == 0: + consistency += beta + continue + + bboxes_key, conf_key, _ = split_result(aug_results_dict[key]) + cls_scores_aug = 1 - conf_key + cls_scores = 1 - conf + + consistency_per_aug = 2.0 + ious = get_ious(bboxes_key, aug_bboxes_dict[key]) + aug_idxs = np.argmax(ious, axis=0) + for origin_idx, aug_idx in enumerate(aug_idxs): + max_iou = ious[aug_idx, origin_idx] + if max_iou == 0: + consistency_per_aug = min(consistency_per_aug, beta) + p = cls_scores_aug[aug_idx] + q = cls_scores[origin_idx] + m = (p + q) / 2. + js = 0.5 * entropy([p, 1 - p], [m, 1 - m]) + 0.5 * entropy([q, 1 - q], [m, 1 - m]) + if js < 0: + js = 0 + consistency_box = max_iou + consistency_cls = 0.5 * \ + (conf[origin_idx] + conf_key[aug_idx]) * (1 - js) + consistency_per_inst = abs(consistency_box + consistency_cls - beta) + consistency_per_aug = min(consistency_per_aug, consistency_per_inst.item()) + + consistency += consistency_per_aug + + consistency /= len(aug_results_dict) + + mining_result.append((asset_path, consistency)) + idx += 1 + + if idx % monitor_gap == 0: + percent = get_ymir_process(stage=YmirStage.TASK, + p=idx / N, + task_idx=self.task_idx, + task_num=self.task_num) + monitor.write_monitor_logger(percent=percent) + + if RANK != -1: + mining_result = collect_results_gpu(mining_result, len(images)) + + return mining_result + + def predict(self, img: CV_IMAGE) -> NDArray: + """ + predict single image and return bbox information + img: opencv BGR, uint8 format + """ + results = self.infer(img) + + xyxy_conf_idx_list = [] + for idx, result in enumerate(results): + for line in result: + if any(np.isinf(line)): + continue + x1, y1, x2, y2, score = line + xyxy_conf_idx_list.append([x1, y1, x2, y2, score, idx]) + + if len(xyxy_conf_idx_list) == 0: + return np.zeros(shape=(0, 6), dtype=np.float32) + else: + return np.array(xyxy_conf_idx_list, dtype=np.float32) + + def aug_predict(self, image: CV_IMAGE, bboxes: BBOX) -> Tuple[Dict[str, BBOX], Dict[str, NDArray]]: + """ + for different augmentation methods: flip, cutout, rotate and resize + augment the image and bbox and use model to predict them. + + return the predict result and augment bbox. 
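+        Both returned dicts share the keys 'flip', 'cutout', 'rotate' and
+        'resize'; aug_bboxes holds the transformed input boxes, aug_results
+        holds predict() output of shape (k, 6) as xyxy, conf, cls.
+        A usage sketch (the `miner` name is assumed, not part of this file):
+
+            aug_bboxes, aug_results = miner.aug_predict(img, bboxes)
+            flip_boxes = aug_results['flip'][:, :4]  # boxes on the flipped image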
+ """ + aug_dict: Dict[str, Callable] = dict(flip=horizontal_flip, cutout=cutout, rotate=rotate, resize=resize) + + aug_bboxes = dict() + aug_results = dict() + for key in aug_dict: + aug_img, aug_bbox = aug_dict[key](image, bboxes) + + aug_result = self.predict(aug_img) + aug_bboxes[key] = aug_bbox + aug_results[key] = aug_result + + return aug_bboxes, aug_results + + +def main(): + if LOCAL_RANK != -1: + init_dist(launcher='pytorch', backend="nccl" if dist.is_nccl_available() else "gloo") + + cfg = get_merged_config() + miner = YmirMining(cfg) + gpu_id: str = str(cfg.param.get('gpu_id', '0')) + gpu = int(gpu_id.split(',')[LOCAL_RANK]) + device = torch.device('cuda', gpu) + miner.model.to(device) + mining_result = miner.mining() + + if RANK in [0, -1]: + rw.write_mining_result(mining_result=mining_result) + + percent = get_ymir_process(stage=YmirStage.POSTPROCESS, p=1, task_idx=miner.task_idx, task_num=miner.task_num) + monitor.write_monitor_logger(percent=percent) + + return 0 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/det-mmdetection-tmi/mining/ymir_mining_aldd.py b/det-mmdetection-tmi/mining/ymir_mining_aldd.py new file mode 100644 index 0000000..e69de29 diff --git a/det-mmdetection-tmi/mining_base.py b/det-mmdetection-tmi/mining_base.py new file mode 100644 index 0000000..27ba2f9 --- /dev/null +++ b/det-mmdetection-tmi/mining_base.py @@ -0,0 +1,137 @@ +import warnings +from typing import List + +import torch +import torch.nn.functional as F +from easydict import EasyDict as edict + + +def binary_classification_entropy(p: torch.Tensor) -> torch.Tensor: + """ + p: BCHW, the feature map after sigmoid, range in (0,1) + F.bce(x,y) = -(y * logx + (1-y) * log(1-x)) + """ + # return -(p * torch.log(p) + (1 - p) * torch.log(1 - p)) + return F.binary_cross_entropy(p, p, reduction='none') + + +def multiple_classification_entropy(p: torch.Tensor, activation: str) -> torch.Tensor: + """ + p: BCHW + + yolov5: sigmoid + nanodet: sigmoid + """ + assert activation in ['sigmoid', 'softmax'], f'classification type = {activation}, not in sigmoid, softmax' + + if activation == 'sigmoid': + entropy = F.binary_cross_entropy(p, p, reduction='none') + sum_entropy = torch.sum(entropy, dim=1, keepdim=True) + return sum_entropy + else: + # for origin aldd code, use tf.log(p + 1e-12) + entropy = -(p) * torch.log(p + 1e-7) + sum_entropy = torch.sum(entropy, dim=1, keepdim=True) + return sum_entropy + + +class FeatureMapBasedMining(object): + + def __init__(self, ymir_cfg: edict): + self.ymir_cfg = ymir_cfg + + def mining(self, feature_maps: List[torch.Tensor]) -> torch.Tensor: + raise Exception('not implement') + + +class ALDDMining(FeatureMapBasedMining): + """ + Active Learning for Deep Detection Neural Networks (ICCV 2019) + official code: https://gitlab.com/haghdam/deep_active_learning + + change from tensorflow code to pytorch code + 1. average pooling changed, pad or not? symmetrical pad or not? + 2. max pooling changed, ceil or not? + 3. the resize shape for aggregate feature map + + those small change cause 20%-40% difference for P@N, N=100 for total 1000 images. 
+ P@5: 0.2 + P@10: 0.3 + P@20: 0.35 + P@50: 0.5 + P@100: 0.59 + P@200: 0.73 + P@500: 0.848 + """ + + def __init__(self, ymir_cfg: edict, resize_shape: List[int]): + super().__init__(ymir_cfg) + self.resize_shape = resize_shape + self.max_pool_size = 32 + self.avg_pool_size = 9 + self.align_corners = False + self.num_classes = len(ymir_cfg.param.class_names) + + def extract_conf(self, feature_maps: List[torch.Tensor], format='yolov5') -> List[torch.Tensor]: + """ + extract confidence feature map before sigmoid. + """ + if format == 'yolov5': + # feature_maps: [bs, 3, height, width, xywh + conf + num_classes] + return [f[:, :, :, :, 4] for f in feature_maps] + else: + warnings.warn(f'unknown feature map format {format}') + + return feature_maps + + def mining(self, feature_maps: List[torch.Tensor]) -> torch.Tensor: + """mining for feature maps + feature_maps: [BCHW] + 1. resizing followed by sigmoid + 2. get mining score + """ + # fmap = [Batch size, anchor number = 3, height, width, 5 + class_number] + + list_tmp = [] + for fmap in feature_maps: + resized_fmap = F.interpolate(fmap, self.resize_shape, mode='bilinear', align_corners=self.align_corners) + list_tmp.append(resized_fmap) + conf = torch.cat(list_tmp, dim=1).sigmoid() + scores = self.get_mining_score(conf) + return scores + + def get_mining_score(self, confidence_feature_map: torch.Tensor) -> torch.Tensor: + """ + confidence_feature_map: BCHW, value in (0, 1) + 1. A=sum(avg(entropy(fmap))) B,1,H,W + 2. B=sum(entropy(avg(fmap))) B,1,H,W + 3. C=max(B-A) B,1,h,w + 4. mean(C) B + """ + avg_entropy = F.avg_pool2d(self.get_entropy(confidence_feature_map), + kernel_size=self.avg_pool_size, + stride=1, + padding=0) + sum_avg_entropy = torch.sum(avg_entropy, dim=1, keepdim=True) + + entropy_avg = self.get_entropy( + F.avg_pool2d(confidence_feature_map, kernel_size=self.avg_pool_size, stride=1, padding=0)) + sum_entropy_avg = torch.sum(entropy_avg, dim=1, keepdim=True) + + uncertainty = sum_entropy_avg - sum_avg_entropy + + max_uncertainty = F.max_pool2d(uncertainty, + kernel_size=self.max_pool_size, + stride=self.max_pool_size, + padding=0, + ceil_mode=False) + + return torch.mean(max_uncertainty, dim=(1, 2, 3)) + + def get_entropy(self, feature_map: torch.Tensor) -> torch.Tensor: + if self.num_classes == 1: + # binary cross entropy + return binary_classification_entropy(feature_map) + else: + # multi-class cross entropy + return multiple_classification_entropy(feature_map, activation='sigmoid') diff --git a/det-mmdetection-tmi/mmdet/core/evaluation/eval_hooks.py b/det-mmdetection-tmi/mmdet/core/evaluation/eval_hooks.py index 7c1fbe9..81a36bb 100644 --- a/det-mmdetection-tmi/mmdet/core/evaluation/eval_hooks.py +++ b/det-mmdetection-tmi/mmdet/core/evaluation/eval_hooks.py @@ -6,7 +6,10 @@ import torch.distributed as dist from mmcv.runner import DistEvalHook as BaseDistEvalHook from mmcv.runner import EvalHook as BaseEvalHook +from mmdet.utils.util_ymir import write_ymir_training_result from torch.nn.modules.batchnorm import _BatchNorm +from ymir_exc import monitor +from ymir_exc.util import YmirStage, get_ymir_process def _calc_dynamic_intervals(start_interval, dynamic_interval_list): @@ -43,10 +46,29 @@ def before_train_epoch(self, runner): self._decide_interval(runner) super().before_train_epoch(runner) + def after_train_epoch(self, runner): + """Report the training process for ymir""" + if self.by_epoch: + monitor_interval = max(1, runner.max_epochs // 1000) + if runner.epoch % monitor_interval == 0: + percent = get_ymir_process( + 
stage=YmirStage.TASK, p=runner.epoch / runner.max_epochs) + monitor.write_monitor_logger(percent=percent) + super().after_train_epoch(runner) + def before_train_iter(self, runner): self._decide_interval(runner) super().before_train_iter(runner) + def after_train_iter(self, runner): + if not self.by_epoch: + monitor_interval = max(1, runner.max_iters // 1000) + if runner.iter % monitor_interval == 0: + percent = get_ymir_process( + stage=YmirStage.TASK, p=runner.iter / runner.max_iters) + monitor.write_monitor_logger(percent=percent) + super().after_train_iter(runner) + def _do_evaluate(self, runner): """perform evaluation and save ckpt.""" if not self._should_evaluate(runner): @@ -56,11 +78,18 @@ def _do_evaluate(self, runner): results = single_gpu_test(runner.model, self.dataloader, show=False) runner.log_buffer.output['eval_iter_num'] = len(self.dataloader) key_score = self.evaluate(runner, results) + write_ymir_training_result(last=False, key_score=key_score) # the key_score may be `None` so it needs to skip the action to save # the best checkpoint if self.save_best and key_score: self._save_ckpt(runner, key_score) + # TODO obtain best_score from runner + # best_score = runner.meta['hook_msgs'].get( + # 'best_score', self.init_value_map[self.rule]) + # if self.compare_func(key_score, best_score): + # write_ymir_training_result(key_score) + # Note: Considering that MMCV's EvalHook updated its interface in V1.3.16, # in order to avoid strong version dependency, we did not directly @@ -87,10 +116,29 @@ def before_train_epoch(self, runner): self._decide_interval(runner) super().before_train_epoch(runner) + def after_train_epoch(self, runner): + """Report the training process for ymir""" + if self.by_epoch and runner.rank == 0: + monitor_interval = max(1, runner.max_epochs // 1000) + if runner.epoch % monitor_interval == 0: + percent = get_ymir_process( + stage=YmirStage.TASK, p=runner.epoch / runner.max_epochs) + monitor.write_monitor_logger(percent=percent) + super().after_train_epoch(runner) + def before_train_iter(self, runner): self._decide_interval(runner) super().before_train_iter(runner) + def after_train_iter(self, runner): + if not self.by_epoch and runner.rank == 0: + monitor_interval = max(1, runner.max_iters // 1000) + if runner.iter % monitor_interval == 0: + percent = get_ymir_process( + stage=YmirStage.TASK, p=runner.iter / runner.max_iters) + monitor.write_monitor_logger(percent=percent) + super().after_train_iter(runner) + def _do_evaluate(self, runner): """perform evaluation and save ckpt.""" # Synchronization of BatchNorm's buffer (running_mean @@ -123,8 +171,14 @@ def _do_evaluate(self, runner): print('\n') runner.log_buffer.output['eval_iter_num'] = len(self.dataloader) key_score = self.evaluate(runner, results) - + write_ymir_training_result(last=False, key_score=key_score) # the key_score may be `None` so it needs to skip # the action to save the best checkpoint if self.save_best and key_score: self._save_ckpt(runner, key_score) + + # TODO obtain best_score from runner + # best_score = runner.meta['hook_msgs'].get( + # 'best_score', self.init_value_map[self.rule]) + # if self.compare_func(key_score, best_score): + # write_ymir_training_result(key_score) diff --git a/det-mmdetection-tmi/mmdet/datasets/__init__.py b/det-mmdetection-tmi/mmdet/datasets/__init__.py index f251d07..ff66046 100644 --- a/det-mmdetection-tmi/mmdet/datasets/__init__.py +++ b/det-mmdetection-tmi/mmdet/datasets/__init__.py @@ -15,6 +15,7 @@ from .voc import VOCDataset from .wider_face import 
WIDERFaceDataset from .xml_style import XMLDataset +from .ymir import YmirDataset __all__ = [ 'CustomDataset', 'XMLDataset', 'CocoDataset', 'DeepFashionDataset', @@ -24,5 +25,5 @@ 'ClassBalancedDataset', 'WIDERFaceDataset', 'DATASETS', 'PIPELINES', 'build_dataset', 'replace_ImageToTensor', 'get_loading_pipeline', 'NumClassCheckHook', 'CocoPanopticDataset', 'MultiImageMixDataset', - 'OpenImagesDataset', 'OpenImagesChallengeDataset' + 'OpenImagesDataset', 'OpenImagesChallengeDataset', 'YmirDataset' ] diff --git a/det-mmdetection-tmi/mmdet/datasets/coco.py b/det-mmdetection-tmi/mmdet/datasets/coco.py index efd6949..7de1cdb 100644 --- a/det-mmdetection-tmi/mmdet/datasets/coco.py +++ b/det-mmdetection-tmi/mmdet/datasets/coco.py @@ -3,6 +3,7 @@ import io import itertools import logging +import os import os.path as osp import tempfile import warnings @@ -12,7 +13,6 @@ import numpy as np from mmcv.utils import print_log from terminaltables import AsciiTable - from mmdet.core import eval_recalls from .api_wrappers import COCO, COCOeval from .builder import DATASETS @@ -592,4 +592,14 @@ def evaluate(self, f'{ap[4]:.3f} {ap[5]:.3f}') if tmp_dir is not None: tmp_dir.cleanup() + + COCO_EVAL_TMP_FILE = os.getenv('COCO_EVAL_TMP_FILE') + if COCO_EVAL_TMP_FILE is not None: + mmcv.dump(eval_results, COCO_EVAL_TMP_FILE, file_format='json') + else: + raise Exception( + 'please set valid environment variable COCO_EVAL_TMP_FILE to write result into json file') + + print_log( + f'\n write eval result to {COCO_EVAL_TMP_FILE}', logger=logger) return eval_results diff --git a/det-mmdetection-tmi/mmdet/datasets/ymir.py b/det-mmdetection-tmi/mmdet/datasets/ymir.py new file mode 100644 index 0000000..9215624 --- /dev/null +++ b/det-mmdetection-tmi/mmdet/datasets/ymir.py @@ -0,0 +1,186 @@ +# Copyright (c) OpenMMLab voc.py. All rights reserved. +# wangjiaxin 2022-04-25 + +import os.path as osp +import imagesize + +import json +from .builder import DATASETS +from .api_wrappers import COCO +from .coco import CocoDataset + + +@DATASETS.register_module() +class YmirDataset(CocoDataset): + """ + converted dataset by ymir system 1.0.0 + + /in/assets: image files directory + /in/annotations: annotation files directory + /in/train-index.tsv: image_file \t annotation_file + /in/val-index.tsv: image_file \t annotation_file + """ + + def __init__(self, + min_size=0, + ann_prefix='annotations', + **kwargs): + self.min_size = min_size + self.ann_prefix = ann_prefix + super(YmirDataset, self).__init__(**kwargs) + + def load_annotations(self, ann_file): + """Load annotation from TXT style ann_file. + + Args: + ann_file (str): Path of TXT file. + + Returns: + list[dict]: Annotation info from TXT file. 
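+
+        Each index line is `image_path<whitespace>annotation_path`; each
+        annotation file holds one object per line as
+        `class_id, xmin, ymin, xmax, ymax` (see get_txt_ann_info below).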
+ """ + + images = [] + categories = [] + # category_id is from 1 for coco, not 0 + for i, name in enumerate(self.CLASSES): + categories.append({'supercategory': 'none', + 'id': i+1, + 'name': name}) + + annotations = [] + instance_counter = 1 + image_counter = 1 + + with open(ann_file, 'r') as fp: + lines = fp.readlines() + + for line in lines: + # split any white space + img_path, ann_path = line.strip().split() + width, height = imagesize.get(img_path) + images.append( + dict(id=image_counter, + file_name=img_path, + ann_path=ann_path, + width=width, + height=height)) + + try: + anns = self.get_txt_ann_info(ann_path) + except Exception as e: + print(f'bad annotation for {ann_path} with {e}') + anns = [] + + for ann in anns: + ann['image_id'] = image_counter + ann['id'] = instance_counter + annotations.append(ann) + instance_counter += 1 + + image_counter += 1 + + # pycocotool coco init + self.coco = COCO() + self.coco.dataset['type'] = 'instances' + self.coco.dataset['categories'] = categories + self.coco.dataset['images'] = images + self.coco.dataset['annotations'] = annotations + self.coco.createIndex() + + # mmdetection coco init + # avoid the filter problem in CocoDataset, view coco_api.py for detail + self.coco.img_ann_map = self.coco.imgToAnns + self.coco.cat_img_map = self.coco.catToImgs + + # get valid category_id (in annotation, start from 1, arbitary) + self.cat_ids = self.coco.get_cat_ids(cat_names=self.CLASSES) + # convert category_id to label(train_id, start from 0) + self.cat2label = {cat_id: i for i, cat_id in enumerate(self.cat_ids)} + self.img_ids = self.coco.get_img_ids() + # self.img_ids = list(self.coco.imgs.keys()) + assert len(self.img_ids) > 0, 'image number must > 0' + print(f'load {len(self.img_ids)} image from YMIR dataset') + + data_infos = [] + total_ann_ids = [] + for i in self.img_ids: + info = self.coco.load_imgs([i])[0] + info['filename'] = info['file_name'] + data_infos.append(info) + ann_ids = self.coco.get_ann_ids(img_ids=[i]) + total_ann_ids.extend(ann_ids) + assert len(set(total_ann_ids)) == len( + total_ann_ids), f"Annotation ids in '{ann_file}' are not unique!" + return data_infos + + def dump(self, ann_file): + with open(ann_file, 'w') as fp: + json.dump(self.coco.dataset, fp) + + def get_ann_path_from_img_path(self, img_path): + img_id = osp.splitext(osp.basename(img_path))[0] + return osp.join(self.data_root, self.ann_prefix, img_id+'.txt') + + def get_txt_ann_info(self, txt_path): + """Get annotation from TXT file by index. + + Args: + idx (int): Index of data. + + Returns: + dict: Annotation info of specified index. + """ + anns = [] + if osp.exists(txt_path): + with open(txt_path, 'r') as fp: + lines = fp.readlines() + else: + lines = [] + for line in lines: + obj = [int(x) for x in line.strip().split(',')[0:5]] + # YMIR category id starts from 0, coco from 1 + category_id, xmin, ymin, xmax, ymax = obj + h, w = ymax-ymin, xmax-xmin + ignore = 0 + if self.min_size: + assert not self.test_mode + if w < self.min_size or h < self.min_size: + ignore = 1 + + ann = dict( + segmentation=[ + [xmin, ymin, xmax, ymin, xmax, ymax, xmin, ymax]], + area=w*h, + iscrowd=0, + image_id=None, + bbox=[xmin, ymin, w, h], + category_id=category_id+1, # category id is from 1 for coco + id=None, + ignore=ignore + ) + anns.append(ann) + return anns + + def get_cat_ids(self, idx): + """Get category ids in TXT file by index. + + Args: + idx (int): Index of data. + + Returns: + list[int]: All categories in the image of specified index. 
+ """ + + cat_ids = [] + txt_path = self.data_infos[idx]['ann_path'] + if osp.exists(txt_path): + with open(txt_path, 'r') as fp: + lines = fp.readlines() + else: + lines = [] + + for line in lines: + obj = [int(x) for x in line.strip().split(',')] + # label, xmin, ymin, xmax, ymax = obj + cat_ids.append(obj[0]) + + return cat_ids diff --git a/det-mmdetection-tmi/mmdet/utils/util_ymir.py b/det-mmdetection-tmi/mmdet/utils/util_ymir.py new file mode 100644 index 0000000..6cb9ae2 --- /dev/null +++ b/det-mmdetection-tmi/mmdet/utils/util_ymir.py @@ -0,0 +1,314 @@ +""" +utils function for ymir and yolov5 +""" +import glob +import logging +import os +import os.path as osp +from typing import Any, Iterable, List, Optional, Union + +import mmcv +import yaml +from easydict import EasyDict as edict +from mmcv import Config, ConfigDict +from nptyping import NDArray, Shape, UInt8 +from packaging.version import Version +from ymir_exc import result_writer as rw +from ymir_exc.util import get_merged_config + +BBOX = NDArray[Shape['*,4'], Any] +CV_IMAGE = NDArray[Shape['*,*,3'], UInt8] + + +def modify_mmcv_config(mmcv_cfg: Config, ymir_cfg: edict) -> None: + """ + useful for training process + - modify dataset config + - modify model output channel + - modify epochs, checkpoint, tensorboard config + """ + + def recursive_modify_attribute(mmcv_cfgdict: Union[Config, ConfigDict], attribute_key: str, attribute_value: Any): + """ + recursive modify mmcv_cfg: + 1. mmcv_cfg.attribute_key to attribute_value + 2. mmcv_cfg.xxx.xxx.xxx.attribute_key to attribute_value (recursive) + 3. mmcv_cfg.xxx[i].attribute_key to attribute_value (i=0, 1, 2 ...) + 4. mmcv_cfg.xxx[i].xxx.xxx[j].attribute_key to attribute_value + """ + for key in mmcv_cfgdict: + if key == attribute_key: + mmcv_cfgdict[key] = attribute_value + logging.info(f'modify {mmcv_cfgdict}, {key} = {attribute_value}') + elif isinstance(mmcv_cfgdict[key], (Config, ConfigDict)): + recursive_modify_attribute(mmcv_cfgdict[key], attribute_key, attribute_value) + elif isinstance(mmcv_cfgdict[key], Iterable): + for cfg in mmcv_cfgdict[key]: + if isinstance(cfg, (Config, ConfigDict)): + recursive_modify_attribute(cfg, attribute_key, attribute_value) + + # modify dataset config + ymir_ann_files = dict(train=ymir_cfg.ymir.input.training_index_file, + val=ymir_cfg.ymir.input.val_index_file, + test=ymir_cfg.ymir.input.candidate_index_file) + + # validation may augment the image and use more gpu + # so set smaller samples_per_gpu for validation + samples_per_gpu = ymir_cfg.param.samples_per_gpu + workers_per_gpu = ymir_cfg.param.workers_per_gpu + mmcv_cfg.data.samples_per_gpu = samples_per_gpu + mmcv_cfg.data.workers_per_gpu = workers_per_gpu + + # modify model output channel + num_classes = len(ymir_cfg.param.class_names) + recursive_modify_attribute(mmcv_cfg.model, 'num_classes', num_classes) + + for split in ['train', 'val', 'test']: + ymir_dataset_cfg = dict(type='YmirDataset', + ann_file=ymir_ann_files[split], + img_prefix=ymir_cfg.ymir.input.assets_dir, + ann_prefix=ymir_cfg.ymir.input.annotations_dir, + classes=ymir_cfg.param.class_names, + data_root=ymir_cfg.ymir.input.root_dir, + filter_empty_gt=False) + # modify dataset config for `split` + mmdet_dataset_cfg = mmcv_cfg.data.get(split, None) + if mmdet_dataset_cfg is None: + continue + + if isinstance(mmdet_dataset_cfg, (list, tuple)): + for x in mmdet_dataset_cfg: + x.update(ymir_dataset_cfg) + else: + src_dataset_type = mmdet_dataset_cfg.type + if src_dataset_type in ['CocoDataset', 'YmirDataset']: + 
mmdet_dataset_cfg.update(ymir_dataset_cfg)
+            elif src_dataset_type in ['MultiImageMixDataset', 'RepeatDataset']:
+                mmdet_dataset_cfg.dataset.update(ymir_dataset_cfg)
+            else:
+                raise Exception(f'unsupported source dataset type {src_dataset_type}')
+
+    # modify epochs, checkpoint, tensorboard config
+    if ymir_cfg.param.get('max_epochs', None):
+        mmcv_cfg.runner.max_epochs = int(ymir_cfg.param.max_epochs)
+    mmcv_cfg.checkpoint_config['out_dir'] = ymir_cfg.ymir.output.models_dir
+    tensorboard_logger = dict(type='TensorboardLoggerHook', log_dir=ymir_cfg.ymir.output.tensorboard_dir)
+    if len(mmcv_cfg.log_config['hooks']) <= 1:
+        mmcv_cfg.log_config['hooks'].append(tensorboard_logger)
+    else:
+        mmcv_cfg.log_config['hooks'][1].update(tensorboard_logger)
+
+    # TODO save only the best top-k model weight files.
+    # modify evaluation metric and interval
+    val_interval: int = int(ymir_cfg.param.get('val_interval', 1))
+    if val_interval > 0:
+        val_interval = min(val_interval, mmcv_cfg.runner.max_epochs)
+    else:
+        val_interval = 1
+
+    mmcv_cfg.evaluation.interval = val_interval
+    mmcv_cfg.evaluation.metric = ymir_cfg.param.get('metric', 'bbox')
+
+    # save the best top-k model weight files
+    # max_keep_ckpts <= 0 means save all checkpoints
+    max_keep_ckpts: int = int(ymir_cfg.param.get('max_keep_checkpoints', 1))
+    mmcv_cfg.checkpoint_config.interval = mmcv_cfg.evaluation.interval
+    mmcv_cfg.checkpoint_config.max_keep_ckpts = max_keep_ckpts
+
+    # TODO whether to evaluate the AP for each class
+    # mmdet_cfg.evaluation.classwise = True
+
+    # fix DDP error
+    mmcv_cfg.find_unused_parameters = True
+
+    # set work dir
+    mmcv_cfg.work_dir = ymir_cfg.ymir.output.models_dir
+
+    args_options = ymir_cfg.param.get("args_options", '')
+    cfg_options = ymir_cfg.param.get("cfg_options", '')
+
+    # automatically load the offered weight file if none was set by the user
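+    # precedence (see get_best_weight_file below): an explicit --resume-from /
+    # --load-from in args_options or load_from / resume_from in cfg_options
+    # always wins; otherwise best_*.pth beats the newest epoch_*/iter_*.pth,
+    # with the coco-pretrained yolox weights under /weights as the fallback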
+ if (args_options.find('--resume-from') == -1 and args_options.find('--load-from') == -1 + and cfg_options.find('load_from') == -1 and cfg_options.find('resume_from') == -1): # noqa: E129 + + weight_file = get_best_weight_file(ymir_cfg) + if weight_file: + if cfg_options: + cfg_options += f' load_from={weight_file}' + else: + cfg_options = f'load_from={weight_file}' + else: + logging.warning('no weight file used for training!') + + +def get_best_weight_file(cfg: edict) -> str: + """ + return the weight file path by priority + find weight file in cfg.param.pretrained_model_params or cfg.param.model_params_path + load coco-pretrained weight for yolox + """ + if cfg.ymir.run_training: + model_params_path: List[str] = cfg.param.get('pretrained_model_params', []) + else: + model_params_path = cfg.param.get('model_params_path', []) + + model_dir = cfg.ymir.input.models_dir + model_params_path = [ + osp.join(model_dir, p) for p in model_params_path + if osp.exists(osp.join(model_dir, p)) and p.endswith(('.pth', '.pt')) + ] + + # choose weight file by priority, best_xxx.pth > latest.pth > epoch_xxx.pth + best_pth_files = [f for f in model_params_path if osp.basename(f).startswith('best_')] + if len(best_pth_files) > 0: + return max(best_pth_files, key=os.path.getctime) + + epoch_pth_files = [f for f in model_params_path if osp.basename(f).startswith(('epoch_', 'iter_'))] + if len(epoch_pth_files) > 0: + return max(epoch_pth_files, key=os.path.getctime) + + if cfg.ymir.run_training: + weight_files = [f for f in glob.glob('/weights/**/*', recursive=True) if f.endswith(('.pth', '.pt'))] + + # load pretrained model weight for yolox only + model_name_splits = osp.basename(cfg.param.config_file).split('_') + if len(weight_files) > 0 and model_name_splits[0] == 'yolox': + yolox_weight_files = [ + f for f in weight_files if osp.basename(f).startswith(f'yolox_{model_name_splits[1]}') + ] + + if len(yolox_weight_files) == 0: + if model_name_splits[1] == 'nano': + # yolox_tiny_8x8_300e_coco_20211124_171234-b4047906.pth or yolox_tiny.py + yolox_weight_files = [f for f in weight_files if osp.basename(f).startswith('yolox_tiny')] + else: + yolox_weight_files = [f for f in weight_files if osp.basename(f).startswith('yolox_s')] + + if len(yolox_weight_files) > 0: + logging.info(f'load yolox pretrained weight {yolox_weight_files[0]}') + return yolox_weight_files[0] + return "" + + +def write_ymir_training_result(last: bool = False, key_score: Optional[float] = None): + YMIR_VERSION = os.environ.get('YMIR_VERSION', '1.2.0') + if Version(YMIR_VERSION) >= Version('1.2.0'): + _write_latest_ymir_training_result(last, key_score) + else: + _write_ancient_ymir_training_result(key_score) + + +def get_topk_checkpoints(files: List[str], k: int) -> List[str]: + """ + keep topk checkpoint files, remove other files. 
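+    Note: nothing is deleted on disk here; callers only drop the other
+    checkpoints from the file list reported to ymir.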
+ """ + checkpoints_files = [f for f in files if f.endswith(('.pth', '.pt'))] + + best_pth_files = [f for f in checkpoints_files if osp.basename(f).startswith('best_')] + if len(best_pth_files) > 0: + # newest first + topk_best_pth_files = sorted(best_pth_files, key=os.path.getctime, reverse=True) + else: + topk_best_pth_files = [] + + epoch_pth_files = [f for f in checkpoints_files if osp.basename(f).startswith(('epoch_', 'iter_'))] + if len(epoch_pth_files) > 0: + topk_epoch_pth_files = sorted(epoch_pth_files, key=os.path.getctime, reverse=True) + else: + topk_epoch_pth_files = [] + + # python will check the length of list + return topk_best_pth_files[0:k] + topk_epoch_pth_files[0:k] + + +# TODO save topk checkpoints, fix invalid stage due to delete checkpoint +def _write_latest_ymir_training_result(last: bool = False, key_score: Optional[float] = None): + if key_score: + logging.info(f'key_score is {key_score}') + COCO_EVAL_TMP_FILE = os.getenv('COCO_EVAL_TMP_FILE') + if COCO_EVAL_TMP_FILE is None: + raise Exception('please set valid environment variable COCO_EVAL_TMP_FILE to write result into json file') + + eval_result = mmcv.load(COCO_EVAL_TMP_FILE) + # eval_result may be empty dict {}. + map = eval_result.get('bbox_mAP_50', 0) + + WORK_DIR = os.getenv('YMIR_MODELS_DIR') + if WORK_DIR is None or not osp.isdir(WORK_DIR): + raise Exception(f'please set valid environment variable YMIR_MODELS_DIR, invalid directory {WORK_DIR}') + + # assert only one model config file in work_dir + result_files = [f for f in glob.glob(osp.join(WORK_DIR, '*')) if osp.basename(f) != 'result.yaml'] + + if last: + # save all output file + ymir_cfg = get_merged_config() + max_keep_checkpoints = int(ymir_cfg.param.get('max_keep_checkpoints', 1)) + if max_keep_checkpoints > 0: + topk_checkpoints = get_topk_checkpoints(result_files, max_keep_checkpoints) + result_files = [f for f in result_files if not f.endswith(('.pth', '.pt'))] + topk_checkpoints + + result_files = [osp.basename(f) for f in result_files] + rw.write_model_stage(files=result_files, mAP=float(map), stage_name='last') + else: + result_files = [osp.basename(f) for f in result_files] + # save newest weight file in format epoch_xxx.pth or iter_xxx.pth + weight_files = [ + osp.join(WORK_DIR, f) for f in result_files if f.startswith(('iter_', 'epoch_')) and f.endswith('.pth') + ] + + if len(weight_files) > 0: + newest_weight_file = osp.basename(max(weight_files, key=os.path.getctime)) + + stage_name = osp.splitext(newest_weight_file)[0] + training_result_file = osp.join(WORK_DIR, 'result.yaml') + if osp.exists(training_result_file): + with open(training_result_file, 'r') as f: + training_result = yaml.safe_load(f) + model_stages = training_result.get('model_stages', {}) + else: + model_stages = {} + + if stage_name not in model_stages: + config_files = [f for f in result_files if f.endswith('.py')] + rw.write_model_stage(files=[newest_weight_file] + config_files, mAP=float(map), stage_name=stage_name) + + +def _write_ancient_ymir_training_result(key_score: Optional[float] = None): + if key_score: + logging.info(f'key_score is {key_score}') + + COCO_EVAL_TMP_FILE = os.getenv('COCO_EVAL_TMP_FILE') + if COCO_EVAL_TMP_FILE is None: + raise Exception('please set valid environment variable COCO_EVAL_TMP_FILE to write result into json file') + + eval_result = mmcv.load(COCO_EVAL_TMP_FILE) + # eval_result may be empty dict {}. 
+ map = eval_result.get('bbox_mAP_50', 0) + + ymir_cfg = get_merged_config() + WORK_DIR = ymir_cfg.ymir.output.models_dir + + # assert only one model config file in work_dir + result_files = [f for f in glob.glob(osp.join(WORK_DIR, '*')) if osp.basename(f) != 'result.yaml'] + + max_keep_checkpoints = int(ymir_cfg.param.get('max_keep_checkpoints', 1)) + if max_keep_checkpoints > 0: + topk_checkpoints = get_topk_checkpoints(result_files, max_keep_checkpoints) + result_files = [f for f in result_files if not f.endswith(('.pth', '.pt'))] + topk_checkpoints + + # convert to basename + result_files = [osp.basename(f) for f in result_files] + + training_result_file = osp.join(WORK_DIR, 'result.yaml') + if osp.exists(training_result_file): + with open(training_result_file, 'r') as f: + training_result = yaml.safe_load(f) + + training_result['model'] = result_files + training_result['map'] = max(map, training_result['map']) + else: + training_result = dict(model=result_files, map=map) + + with open(training_result_file, 'w') as f: + yaml.safe_dump(training_result, f) diff --git a/det-mmdetection-tmi/requirements/runtime.txt b/det-mmdetection-tmi/requirements/runtime.txt index f7a2cc7..cf0fac6 100644 --- a/det-mmdetection-tmi/requirements/runtime.txt +++ b/det-mmdetection-tmi/requirements/runtime.txt @@ -2,4 +2,10 @@ matplotlib numpy pycocotools six +scipy terminaltables +easydict +nptyping +imagesize>=1.3.0 +future +tensorboard>=2.5.0 diff --git a/det-mmdetection-tmi/start.py b/det-mmdetection-tmi/start.py new file mode 100644 index 0000000..13402f2 --- /dev/null +++ b/det-mmdetection-tmi/start.py @@ -0,0 +1,73 @@ +import logging +import os +import subprocess +import sys + +from easydict import EasyDict as edict +from ymir_exc import monitor +from ymir_exc.util import find_free_port, get_merged_config + + +def start(cfg: edict) -> int: + logging.info(f'merged config: {cfg}') + + if cfg.ymir.run_training: + _run_training() + elif cfg.ymir.run_mining or cfg.ymir.run_infer: + if cfg.ymir.run_mining: + _run_mining(cfg) + if cfg.ymir.run_infer: + _run_infer() + else: + logging.warning('no task running') + + return 0 + + +def _run_training() -> None: + command = 'python3 ymir_train.py' + logging.info(f'start training: {command}') + subprocess.run(command.split(), check=True) + + # if task done, write 100% percent log + monitor.write_monitor_logger(percent=1.0) + logging.info("training finished") + + +def _run_mining(cfg: edict) -> None: + gpu_id: str = str(cfg.param.get('gpu_id', '0')) + gpu_count = len(gpu_id.split(',')) + mining_algorithm: str = cfg.param.get('mining_algorithm', 'aldd') + + supported_mining_algorithm = ['cald', 'aldd', 'random','entropy'] + assert mining_algorithm in supported_mining_algorithm, f'unknown mining_algorithm {mining_algorithm}, not in {supported_mining_algorithm}' + if gpu_count <= 1: + command = f'python3 ymir_mining_{mining_algorithm}.py' + else: + port = find_free_port() + command = f'python3 -m torch.distributed.launch --nproc_per_node {gpu_count} --master_port {port} ymir_mining_{mining_algorithm}.py' # noqa + + logging.info(f'start mining: {command}') + subprocess.run(command.split(), check=True) + logging.info("mining finished") + + +def _run_infer() -> None: + command = 'python3 ymir_infer.py' + logging.info(f'start infer: {command}') + subprocess.run(command.split(), check=True) + logging.info("infer finished") + + +if __name__ == '__main__': + logging.basicConfig(stream=sys.stdout, + format='%(levelname)-8s: [%(asctime)s] %(message)s', + 
datefmt='%Y%m%d-%H:%M:%S',
+                        level=logging.INFO)
+
+    cfg = get_merged_config()
+    os.environ.setdefault('YMIR_MODELS_DIR', cfg.ymir.output.models_dir)
+    os.environ.setdefault('COCO_EVAL_TMP_FILE', os.path.join(
+        cfg.ymir.output.root_dir, 'eval_tmp.json'))
+    os.environ.setdefault('PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION', 'python')
+    sys.exit(start(cfg))
diff --git a/det-mmdetection-tmi/tools/train.py b/det-mmdetection-tmi/tools/train.py
index b9e9981..78fbe46 100644
--- a/det-mmdetection-tmi/tools/train.py
+++ b/det-mmdetection-tmi/tools/train.py
@@ -11,12 +11,13 @@
 from mmcv import Config, DictAction
 from mmcv.runner import get_dist_info, init_dist
 from mmcv.utils import get_git_hash
-
 from mmdet import __version__
 from mmdet.apis import init_random_seed, set_random_seed, train_detector
 from mmdet.datasets import build_dataset
 from mmdet.models import build_detector
 from mmdet.utils import collect_env, get_root_logger, setup_multi_processes
+from mmdet.utils.util_ymir import modify_mmcv_config
+from ymir_exc.util import get_merged_config


 def parse_args():
@@ -96,8 +97,11 @@
 def main():
     args = parse_args()
-
+    ymir_cfg = get_merged_config()
     cfg = Config.fromfile(args.config)
+    # patch the mmdet config with the ymir settings
+    modify_mmcv_config(mmcv_cfg=cfg, ymir_cfg=ymir_cfg)
+
     if args.cfg_options is not None:
         cfg.merge_from_dict(args.cfg_options)
diff --git a/det-mmdetection-tmi/training-template.yaml b/det-mmdetection-tmi/training-template.yaml
new file mode 100644
index 0000000..05b11b2
--- /dev/null
+++ b/det-mmdetection-tmi/training-template.yaml
@@ -0,0 +1,12 @@
+shm_size: '128G'
+export_format: 'ark:raw'
+samples_per_gpu: 16 # batch size per gpu
+workers_per_gpu: 4
+max_epochs: 100
+config_file: 'configs/yolox/yolox_tiny_8x8_300e_coco.py'
+args_options: ''
+cfg_options: ''
+metric: 'bbox'
+val_interval: 1 # evaluation interval in epochs; values <= 0 fall back to 1 (evaluate every epoch)
+max_keep_checkpoints: 1 # <= 0 means save all weight files; 1 means save the last and best weight files; k means save the top-k best and k newest epoch/step weight files
+ymir_saved_file_patterns: '' # custom saved files; supports python regular expressions, use ',' to separate multiple patterns
diff --git a/det-mmdetection-tmi/ymir_infer.py b/det-mmdetection-tmi/ymir_infer.py
new file mode 100644
index 0000000..939e5bf
--- /dev/null
+++ b/det-mmdetection-tmi/ymir_infer.py
@@ -0,0 +1,137 @@
+import argparse
+import os.path as osp
+import sys
+import warnings
+from typing import Any, List
+
+import cv2
+import numpy as np
+from easydict import EasyDict as edict
+from mmcv import DictAction
+from mmdet.apis import inference_detector, init_detector
+from mmdet.utils.util_ymir import get_best_weight_file
+from tqdm import tqdm
+from ymir_exc import dataset_reader as dr
+from ymir_exc import env, monitor
+from ymir_exc import result_writer as rw
+from ymir_exc.util import YmirStage, get_merged_config, get_ymir_process
+
+
+def parse_option(cfg_options: str) -> dict:
+    parser = argparse.ArgumentParser(description='parse cfg options')
+    parser.add_argument('--cfg-options',
+                        nargs='+',
+                        action=DictAction,
+                        help='override some settings in the used config, the key-value pair '
+                        'in xxx=yyy format will be merged into config file. If the value to '
+                        'be overwritten is a list, it should be like key="[a,b]" or key=a,b '
+                        'It also allows nested list/tuple values, e.g. 
key="[(a,b),(c,d)]" ' + 'Note that the quotation marks are necessary and that no white space ' + 'is allowed.') + + args = parser.parse_args(f'--cfg-options {cfg_options}'.split()) + return args.cfg_options + + +def mmdet_result_to_ymir(results: List[Any], class_names: List[str]) -> List[rw.Annotation]: + """ + results: List[NDArray[Shape['*,5'], Any]] + """ + ann_list = [] + for idx, result in enumerate(results): + for line in result: + if any(np.isinf(line)): + continue + x1, y1, x2, y2, score = line + ann = rw.Annotation(class_name=class_names[idx], + score=score, + box=rw.Box(x=round(x1), y=round(y1), w=round(x2 - x1), h=round(y2 - y1))) + ann_list.append(ann) + return ann_list + + +def get_config_file(cfg): + if cfg.ymir.run_training: + model_params_path: List = cfg.param.get('pretrained_model_params', []) + else: + model_params_path: List = cfg.param.get('model_params_path', []) + + model_dir = cfg.ymir.input.models_dir + config_files = [ + osp.join(model_dir, p) for p in model_params_path if osp.exists(osp.join(model_dir, p)) and p.endswith(('.py')) + ] + + if len(config_files) > 0: + if len(config_files) > 1: + warnings.warn(f'multiple config file found! use {config_files[0]}') + return config_files[0] + else: + raise Exception(f'no config_file found in {model_dir} and {model_params_path}') + + +class YmirModel: + def __init__(self, cfg: edict): + self.cfg = cfg + + if cfg.ymir.run_mining and cfg.ymir.run_infer: + # mining_task_idx = 0 + infer_task_idx = 1 + task_num = 2 + else: + # mining_task_idx = 0 + infer_task_idx = 0 + task_num = 1 + + self.task_idx = infer_task_idx + self.task_num = task_num + + # Specify the path to model config and checkpoint file + config_file = get_config_file(cfg) + checkpoint_file = get_best_weight_file(cfg) + options = cfg.param.get('cfg_options', None) + cfg_options = parse_option(options) if options else None + + # current infer can only use one gpu!!! 
+        gpu_ids = cfg.param.get('gpu_id', '0')
+        gpu_id = gpu_ids.split(',')[0]
+        # build the model from a config file and a checkpoint file
+        self.model = init_detector(config_file, checkpoint_file, device=f'cuda:{gpu_id}', cfg_options=cfg_options)
+
+    def infer(self, img):
+        return inference_detector(self.model, img)
+
+
+def main():
+    cfg = get_merged_config()
+
+    N = dr.items_count(env.DatasetType.CANDIDATE)
+    infer_result = dict()
+    model = YmirModel(cfg)
+    idx = -1
+
+    # write infer result
+    monitor_gap = max(1, N // 100)
+    conf_threshold = float(cfg.param.conf_threshold)
+    for asset_path, _ in tqdm(dr.item_paths(dataset_type=env.DatasetType.CANDIDATE)):
+        img = cv2.imread(asset_path)
+        result = model.infer(img)
+        raw_anns = mmdet_result_to_ymir(result, cfg.param.class_names)
+
+        infer_result[asset_path] = [ann for ann in raw_anns if ann.score >= conf_threshold]
+        idx += 1
+
+        if idx % monitor_gap == 0:
+            percent = get_ymir_process(stage=YmirStage.TASK,
+                                       p=idx / N,
+                                       task_idx=model.task_idx,
+                                       task_num=model.task_num)
+            monitor.write_monitor_logger(percent=percent)
+
+    rw.write_infer_result(infer_result=infer_result)
+    percent = get_ymir_process(stage=YmirStage.POSTPROCESS, p=1, task_idx=model.task_idx, task_num=model.task_num)
+    monitor.write_monitor_logger(percent=percent)
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
diff --git a/det-mmdetection-tmi/ymir_mining_aldd.py b/det-mmdetection-tmi/ymir_mining_aldd.py
new file mode 100644
index 0000000..51b5c13
--- /dev/null
+++ b/det-mmdetection-tmi/ymir_mining_aldd.py
@@ -0,0 +1,74 @@
+import sys
+
+import torch
+from easydict import EasyDict as edict
+from mining_base import ALDDMining
+from mmcv.parallel import collate, scatter
+from mmdet.datasets import replace_ImageToTensor
+from mmdet.datasets.pipelines import Compose
+from mmdet.models.detectors import YOLOX
+from ymir_exc.util import get_merged_config
+from ymir_infer import YmirModel
+from ymir_mining_random import RandomMiner
+
+
+class ALDDMiner(RandomMiner):
+
+    def __init__(self, cfg: edict):
+        super().__init__(cfg)
+        self.ymir_model = YmirModel(cfg)
+        mmdet_cfg = self.ymir_model.model.cfg
+        mmdet_cfg.data.test.pipeline = replace_ImageToTensor(mmdet_cfg.data.test.pipeline)
+        self.test_pipeline = Compose(mmdet_cfg.data.test.pipeline)
+        self.aldd_miner = ALDDMining(cfg, [640, 640])
+
+    def compute_score(self, asset_path: str) -> float:
+        dict_data = dict(img_info=dict(filename=asset_path), img_prefix=None)
+        pipeline_data = self.test_pipeline(dict_data)
+        data = collate([pipeline_data], samples_per_gpu=1)
+        # just get the actual data from DataContainer
+        data['img_metas'] = [img_metas.data[0] for img_metas in data['img_metas']]
+        data['img'] = [img.data[0] for img in data['img']]
+        # scatter to specified GPU
+        data = scatter(data, [self.device])[0]
+
+        if isinstance(self.ymir_model.model, YOLOX):
+            # results = (cls_maps, reg_maps, iou_maps)
+            # cls_maps: [BxCx52x52, BxCx26x26, BxCx13x13]
+            # reg_maps: [Bx4x52x52, Bx4x26x26, Bx4x13x13]
+            # iou_maps: [Bx1x52x52, Bx1x26x26, Bx1x13x13]
+            results = self.ymir_model.model.forward_dummy(data['img'][0])
+            feature_maps = []
+            for cls, reg, iou in zip(results[0], results[1], results[2]):
+                maps = [reg, iou, cls]
+                feature_maps.append(torch.cat(maps, dim=1))
+            mining_score = self.aldd_miner.mining(feature_maps)
+
+            return mining_score.item()
+        else:
+            raise NotImplementedError(
+                'aldd mining is not currently supported with {}, only YOLOX is supported'.format(
+                    self.ymir_model.model.__class__.__name__))
+
support other SingleStageDetector + # if isinstance(self.ymir_model.model, SingleStageDetector): + # pass + # elif isinstance(self.ymir_model.model, TwoStageDetector): + # # (rpn_outs, roi_outs) + # # outs = self.ymir_model.model.forward_dummy(img) + # raise NotImplementedError('aldd mining is currently not currently supported TwoStageDetector {}'.format( + # self.ymir_model.model.__class__.__name__)) + # else: + # raise NotImplementedError('aldd mining is currently not currently supported with {}'.format( + # self.ymir_model.model.__class__.__name__)) + + +def main(): + cfg = get_merged_config() + miner = ALDDMiner(cfg) + miner.mining() + return 0 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/det-mmdetection-tmi/ymir_mining_cald.py b/det-mmdetection-tmi/ymir_mining_cald.py new file mode 100644 index 0000000..fe437ff --- /dev/null +++ b/det-mmdetection-tmi/ymir_mining_cald.py @@ -0,0 +1,412 @@ +""" +data augmentations for CALD method, including horizontal_flip, rotate(5'), cutout +official code: https://github.com/we1pingyu/CALD/blob/master/cald/cald_helper.py +""" +import os +import random +import sys +from typing import Any, Callable, Dict, List, Tuple + +import cv2 +import numpy as np +import torch +import torch.distributed as dist +from easydict import EasyDict as edict +from mmcv.runner import init_dist +from mmdet.apis.test import collect_results_gpu +from mmdet.utils.util_ymir import BBOX, CV_IMAGE +from nptyping import NDArray +from scipy.stats import entropy +from tqdm import tqdm +from ymir_exc import monitor +from ymir_exc import result_writer as rw +from ymir_exc.util import YmirStage, get_merged_config, get_ymir_process +from ymir_infer import YmirModel + +LOCAL_RANK = int(os.getenv('LOCAL_RANK', -1)) # https://pytorch.org/docs/stable/elastic/run.html +RANK = int(os.getenv('RANK', -1)) +WORLD_SIZE = int(os.getenv('WORLD_SIZE', 1)) + + +def intersect(boxes1: BBOX, boxes2: BBOX) -> NDArray: + ''' + Find intersection of every box combination between two sets of box + boxes1: bounding boxes 1, a tensor of dimensions (n1, 4) + boxes2: bounding boxes 2, a tensor of dimensions (n2, 4) + + Out: Intersection each of boxes1 with respect to each of boxes2, + a tensor of dimensions (n1, n2) + ''' + n1 = boxes1.shape[0] + n2 = boxes2.shape[0] + max_xy = np.minimum( + np.expand_dims(boxes1[:, 2:], axis=1).repeat(n2, axis=1), + np.expand_dims(boxes2[:, 2:], axis=0).repeat(n1, axis=0)) + + min_xy = np.maximum( + np.expand_dims(boxes1[:, :2], axis=1).repeat(n2, axis=1), + np.expand_dims(boxes2[:, :2], axis=0).repeat(n1, axis=0)) + inter = np.clip(max_xy - min_xy, a_min=0, a_max=None) # (n1, n2, 2) + return inter[:, :, 0] * inter[:, :, 1] # (n1, n2) + + +def horizontal_flip(image: CV_IMAGE, bbox: BBOX) \ + -> Tuple[CV_IMAGE, BBOX]: + """ + image: opencv image, [height,width,channels] + bbox: numpy.ndarray, [N,4] --> [x1,y1,x2,y2] + """ + image = image.copy() + + width = image.shape[1] + # Flip image horizontally + image = image[:, ::-1, :] + if len(bbox) > 0: + bbox = bbox.copy() + # Flip bbox horizontally + bbox[:, [0, 2]] = width - bbox[:, [2, 0]] + return image, bbox + + +def cutout(image: CV_IMAGE, + bbox: BBOX, + cut_num: int = 2, + fill_val: int = 0, + bbox_remove_thres: float = 0.4, + bbox_min_thres: float = 0.1) -> Tuple[CV_IMAGE, BBOX]: + ''' + Cutout augmentation + image: A PIL image + boxes: bounding boxes, a tensor of dimensions (#objects, 4) + labels: labels of object, a tensor of dimensions (#objects) + fill_val: Value filled in cut out + bbox_remove_thres: 
Threshold to remove bbox cut by cutout + + Out: new image, new bboxes + ''' + image = image.copy() + bbox = bbox.copy() + + if len(bbox) == 0: + return image, bbox + + original_h, original_w, original_channel = image.shape + count = 0 + for _ in range(50): + # Random cutout size: [0.05, 0.2] of original dimension + cutout_size_h = random.uniform(0.05 * original_h, 0.2 * original_h) + cutout_size_w = random.uniform(0.05 * original_w, 0.2 * original_w) + + # Random position for cutout + left = random.uniform(0, original_w - cutout_size_w) + right = left + cutout_size_w + top = random.uniform(0, original_h - cutout_size_h) + bottom = top + cutout_size_h + cutout = np.array([[float(left), float(top), float(right), float(bottom)]]) + + # Calculate intersect between cutout and bounding boxes + overlap_size = intersect(cutout, bbox) + area_boxes = (bbox[:, 2] - bbox[:, 0]) * (bbox[:, 3] - bbox[:, 1]) + ratio = overlap_size / (area_boxes + 1e-14) + # retry when the cutout covers too much (> bbox_remove_thres) or too little (< bbox_min_thres) of the most-overlapped box + if ratio.max() > bbox_remove_thres or ratio.max() < bbox_min_thres: + continue + + image[int(top):int(bottom), int(left):int(right), :] = fill_val + count += 1 + if count >= cut_num: + break + return image, bbox + + +def rotate(image: CV_IMAGE, bbox: BBOX, rot: float = 5) -> Tuple[CV_IMAGE, BBOX]: + image = image.copy() + bbox = bbox.copy() + h, w, c = image.shape + center = np.array([w / 2.0, h / 2.0]) + s = max(h, w) * 1.0 + trans = get_affine_transform(center, s, rot, [w, h]) + if len(bbox) > 0: + for i in range(bbox.shape[0]): + x1, y1 = affine_transform(bbox[i, :2], trans) + x2, y2 = affine_transform(bbox[i, 2:], trans) + x3, y3 = affine_transform(bbox[i, [2, 1]], trans) + x4, y4 = affine_transform(bbox[i, [0, 3]], trans) + bbox[i, :2] = [min(x1, x2, x3, x4), min(y1, y2, y3, y4)] + bbox[i, 2:] = [max(x1, x2, x3, x4), max(y1, y2, y3, y4)] + image = cv2.warpAffine(image, trans, (w, h), flags=cv2.INTER_LINEAR) + return image, bbox + + +def get_3rd_point(a: NDArray, b: NDArray) -> NDArray: + direct = a - b + return b + np.array([-direct[1], direct[0]], dtype=np.float32) + + +def get_dir(src_point: NDArray, rot_rad: float) -> List: + sn, cs = np.sin(rot_rad), np.cos(rot_rad) + + src_result = [0, 0] + src_result[0] = src_point[0] * cs - src_point[1] * sn + src_result[1] = src_point[0] * sn + src_point[1] * cs + + return src_result + + +def transform_preds(coords: NDArray, center: NDArray, scale: Any, rot: float, output_size: List) -> NDArray: + trans = get_affine_transform(center, scale, rot, output_size, inv=True) + target_coords = affine_transform(coords, trans) + return target_coords + + +def get_affine_transform(center: NDArray, + scale: Any, + rot: float, + output_size: List, + shift: NDArray = np.array([0, 0], dtype=np.float32), + inv: bool = False) -> NDArray: + if not isinstance(scale, np.ndarray) and not isinstance(scale, list): + scale = np.array([scale, scale], dtype=np.float32) + + scale_tmp = scale + src_w = scale_tmp[0] + dst_w = output_size[0] + dst_h = output_size[1] + + rot_rad = np.pi * rot / 180 + src_dir = get_dir(np.array([0, src_w * -0.5], np.float32), rot_rad) + dst_dir = np.array([0, dst_w * -0.5], np.float32) + + src = np.zeros((3, 2), dtype=np.float32) + dst = np.zeros((3, 2), dtype=np.float32) + src[0, :] = center + scale_tmp * shift + src[1, :] = center + src_dir + scale_tmp * shift + dst[0, :] = [dst_w * 0.5, dst_h * 0.5] + dst[1, :] = np.array([dst_w * 0.5, dst_h * 0.5], np.float32) + dst_dir + + src[2:, :] = get_3rd_point(src[0, :], src[1, :])
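+ # cv2.getAffineTransform solves the affine from three point pairs; get_3rd_point builds the + # third pair by rotating the vector between the first two points by 90 degrees, which makes + # the solution unique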
+ dst[2:, :] = get_3rd_point(dst[0, :], dst[1, :]) + + if inv: + trans = cv2.getAffineTransform(np.float32(dst), np.float32(src)) + else: + trans = cv2.getAffineTransform(np.float32(src), np.float32(dst)) + + return trans + + +def affine_transform(pt: NDArray, t: NDArray) -> NDArray: + new_pt = np.array([pt[0], pt[1], 1.], dtype=np.float32).T + new_pt = np.dot(t, new_pt) + return new_pt[:2] + + +def resize(img: CV_IMAGE, boxes: BBOX, ratio: float = 0.8) -> Tuple[CV_IMAGE, BBOX]: + """ + ratio: <= 1.0 + """ + assert ratio <= 1.0, f'resize ratio {ratio} must be <= 1.0' + + h, w, _ = img.shape + ow = int(w * ratio) + oh = int(h * ratio) + resize_img = cv2.resize(img, (ow, oh)) + new_img = np.zeros_like(img) + new_img[:oh, :ow] = resize_img + + if len(boxes) == 0: + return new_img, boxes + else: + return new_img, boxes * ratio + + +def get_ious(boxes1: BBOX, boxes2: BBOX) -> NDArray: + """ + args: + boxes1: np.array, (N, 4), xyxy + boxes2: np.array, (M, 4), xyxy + return: + iou: np.array, (N, M) + """ + area1 = (boxes1[:, 2] - boxes1[:, 0]) * (boxes1[:, 3] - boxes1[:, 1]) + area2 = (boxes2[:, 2] - boxes2[:, 0]) * (boxes2[:, 3] - boxes2[:, 1]) + inter_area = intersect(boxes1, boxes2) + area1 = area1.reshape(-1, 1).repeat(area2.shape[0], axis=1) + area2 = area2.reshape(1, -1).repeat(area1.shape[0], axis=0) + iou = inter_area / (area1 + area2 - inter_area + 1e-14) + return iou + + +def split_result(result: NDArray) -> Tuple[BBOX, NDArray, NDArray]: + if len(result) > 0: + bboxes = result[:, :4].astype(np.int32) + conf = result[:, 4] + class_id = result[:, 5] + else: + bboxes = np.zeros(shape=(0, 4), dtype=np.int32) + conf = np.zeros(shape=(0, 1), dtype=np.float32) + class_id = np.zeros(shape=(0, 1), dtype=np.int32) + + return bboxes, conf, class_id + + +class YmirMining(YmirModel): + + def __init__(self, cfg: edict): + super().__init__(cfg) + if cfg.ymir.run_mining and cfg.ymir.run_infer: + mining_task_idx = 0 + # infer_task_idx = 1 + task_num = 2 + else: + mining_task_idx = 0 + # infer_task_idx = 0 + task_num = 1 + self.task_idx = mining_task_idx + self.task_num = task_num + + def mining(self): + with open(self.cfg.ymir.input.candidate_index_file, 'r') as f: + images = [line.strip() for line in f.readlines()] + + max_barrier_times = len(images) // WORLD_SIZE + if RANK == -1: + N = len(images) + tbar = tqdm(images) + else: + images_rank = images[RANK::WORLD_SIZE] + N = len(images_rank) + if RANK == 0: + tbar = tqdm(images_rank) + else: + tbar = images_rank + + monitor_gap = max(1, N // 100) + idx = -1 + beta = 1.3 + mining_result = [] + for idx, asset_path in enumerate(tbar): + if idx % monitor_gap == 0: + percent = get_ymir_process(stage=YmirStage.TASK, + p=idx / N, + task_idx=self.task_idx, + task_num=self.task_num) + monitor.write_monitor_logger(percent=percent) + # batch-level sync, avoid 30min time-out error + if WORLD_SIZE > 1 and idx < max_barrier_times: + dist.barrier() + + img = cv2.imread(asset_path) + # xyxy,conf,cls + result = self.predict(img) + bboxes, conf, _ = split_result(result) + if len(result) == 0: + # no result for the image without augmentation + mining_result.append((asset_path, -beta)) + continue + + consistency = 0.0 + aug_bboxes_dict, aug_results_dict = self.aug_predict(img, bboxes) + for key in aug_results_dict: + # no prediction for the image under this augmentation + if len(aug_results_dict[key]) == 0: + consistency += beta + continue + + bboxes_key, conf_key, _ = split_result(aug_results_dict[key]) + cls_scores_aug = 1 - conf_key + cls_scores = 1 - conf + +
consistency_per_aug = 2.0 + ious = get_ious(bboxes_key, aug_bboxes_dict[key]) + aug_idxs = np.argmax(ious, axis=0) + for origin_idx, aug_idx in enumerate(aug_idxs): + max_iou = ious[aug_idx, origin_idx] + if max_iou == 0: + consistency_per_aug = min(consistency_per_aug, beta) + p = cls_scores_aug[aug_idx] + q = cls_scores[origin_idx] + m = (p + q) / 2. + # Jensen-Shannon divergence between the two matched detections' class confidences + js = 0.5 * entropy([p, 1 - p], [m, 1 - m]) + 0.5 * entropy([q, 1 - q], [m, 1 - m]) + if js < 0: + js = 0 + consistency_box = max_iou + consistency_cls = 0.5 * \ + (conf[origin_idx] + conf_key[aug_idx]) * (1 - js) + consistency_per_inst = abs(consistency_box + consistency_cls - beta) + consistency_per_aug = min(consistency_per_aug, consistency_per_inst.item()) + + consistency += consistency_per_aug + + consistency /= len(aug_results_dict) + + mining_result.append((asset_path, consistency)) + + if WORLD_SIZE > 1: + mining_result = collect_results_gpu(mining_result, len(images)) + + return mining_result + + def predict(self, img: CV_IMAGE) -> NDArray: + """ + predict a single image and return bbox information + img: opencv BGR, uint8 format + """ + results = self.infer(img) + + xyxy_conf_idx_list = [] + for idx, result in enumerate(results): + for line in result: + if any(np.isinf(line)): + continue + x1, y1, x2, y2, score = line + xyxy_conf_idx_list.append([x1, y1, x2, y2, score, idx]) + + if len(xyxy_conf_idx_list) == 0: + return np.zeros(shape=(0, 6), dtype=np.float32) + else: + return np.array(xyxy_conf_idx_list, dtype=np.float32) + + def aug_predict(self, image: CV_IMAGE, bboxes: BBOX) -> Tuple[Dict[str, BBOX], Dict[str, NDArray]]: + """ + For each augmentation method (flip, cutout, rotate, resize), + augment the image and bboxes, then run the model on the augmented image. + + Return the prediction results and the augmented bboxes.
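+ + e.g. aug_bboxes = {'flip': ..., 'cutout': ..., 'rotate': ..., 'resize': ...} and + aug_results[key] is an (N, 6) array with rows [x1, y1, x2, y2, score, class_id]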
+ """ + aug_dict: Dict[str, Callable] = dict(flip=horizontal_flip, cutout=cutout, rotate=rotate, resize=resize) + + aug_bboxes = dict() + aug_results = dict() + for key in aug_dict: + aug_img, aug_bbox = aug_dict[key](image, bboxes) + + aug_result = self.predict(aug_img) + aug_bboxes[key] = aug_bbox + aug_results[key] = aug_result + + return aug_bboxes, aug_results + + +def main(): + if LOCAL_RANK != -1: + init_dist(launcher='pytorch', backend="nccl" if dist.is_nccl_available() else "gloo") + + cfg = get_merged_config() + miner = YmirMining(cfg) + gpu = max(0, LOCAL_RANK) + device = torch.device('cuda', gpu) + miner.model.to(device) + mining_result = miner.mining() + + if RANK in [0, -1]: + rw.write_mining_result(mining_result=mining_result) + + percent = get_ymir_process(stage=YmirStage.POSTPROCESS, p=1, task_idx=miner.task_idx, task_num=miner.task_num) + monitor.write_monitor_logger(percent=percent) + + return 0 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/det-mmdetection-tmi/ymir_mining_entropy.py b/det-mmdetection-tmi/ymir_mining_entropy.py new file mode 100644 index 0000000..02426b2 --- /dev/null +++ b/det-mmdetection-tmi/ymir_mining_entropy.py @@ -0,0 +1,150 @@ +""" +data augmentations for CALD method, including horizontal_flip, rotate(5'), cutout +official code: https://github.com/we1pingyu/CALD/blob/master/cald/cald_helper.py +""" +import os +import random +import sys +from typing import Any, Callable, Dict, List, Tuple + +import cv2 +import numpy as np +import torch +import torch.distributed as dist +from easydict import EasyDict as edict +from mmcv.runner import init_dist +from mmdet.apis.test import collect_results_gpu +from mmdet.utils.util_ymir import BBOX, CV_IMAGE +from nptyping import NDArray +from scipy.stats import entropy +from tqdm import tqdm +from ymir_exc import monitor +from ymir_exc import result_writer as rw +from ymir_exc.util import YmirStage, get_merged_config, get_ymir_process +from ymir_infer import YmirModel + +LOCAL_RANK = int(os.getenv('LOCAL_RANK', -1)) # https://pytorch.org/docs/stable/elastic/run.html +RANK = int(os.getenv('RANK', -1)) +WORLD_SIZE = int(os.getenv('WORLD_SIZE', 1)) + + + +def split_result(result: NDArray) -> Tuple[BBOX, NDArray, NDArray]: + if len(result) > 0: + bboxes = result[:, :4].astype(np.int32) + conf = result[:, 4] + class_id = result[:, 5] + else: + bboxes = np.zeros(shape=(0, 4), dtype=np.int32) + conf = np.zeros(shape=(0, 1), dtype=np.float32) + class_id = np.zeros(shape=(0, 1), dtype=np.int32) + + return bboxes, conf, class_id + + +class YmirMining(YmirModel): + + def __init__(self, cfg: edict): + super().__init__(cfg) + if cfg.ymir.run_mining and cfg.ymir.run_infer: + mining_task_idx = 0 + # infer_task_idx = 1 + task_num = 2 + else: + mining_task_idx = 0 + # infer_task_idx = 0 + task_num = 1 + self.task_idx = mining_task_idx + self.task_num = task_num + + def mining(self): + with open(self.cfg.ymir.input.candidate_index_file, 'r') as f: + images = [line.strip() for line in f.readlines()] + + max_barrier_times = len(images) // WORLD_SIZE + if RANK == -1: + N = len(images) + tbar = tqdm(images) + else: + images_rank = images[RANK::WORLD_SIZE] + N = len(images_rank) + if RANK == 0: + tbar = tqdm(images_rank) + else: + tbar = images_rank + + monitor_gap = max(1, N // 100) + idx = -1 + + mining_result = [] + for idx, asset_path in enumerate(tbar): + if idx % monitor_gap == 0: + percent = get_ymir_process(stage=YmirStage.TASK, + p=idx / N, + task_idx=self.task_idx, + task_num=self.task_num) + 
+ monitor.write_monitor_logger(percent=percent) + # batch-level sync, avoid 30min time-out error + if WORLD_SIZE > 1 and idx < max_barrier_times: + dist.barrier() + + img = cv2.imread(asset_path) + # xyxy,conf,cls + result = self.predict(img) + bboxes, conf, _ = split_result(result) + if len(result) == 0: + # no detections: assign the lowest mining score + mining_result.append((asset_path, -10)) + continue + # score = entropy of the detection confidences, i.e. sum of -p * log2(p) + mining_result.append((asset_path, -np.sum(conf * np.log2(conf)))) + + if WORLD_SIZE > 1: + mining_result = collect_results_gpu(mining_result, len(images)) + + return mining_result + + def predict(self, img: CV_IMAGE) -> NDArray: + """ + predict a single image and return bbox information + img: opencv BGR, uint8 format + """ + results = self.infer(img) + + xyxy_conf_idx_list = [] + for idx, result in enumerate(results): + for line in result: + if any(np.isinf(line)): + continue + x1, y1, x2, y2, score = line + xyxy_conf_idx_list.append([x1, y1, x2, y2, score, idx]) + + if len(xyxy_conf_idx_list) == 0: + return np.zeros(shape=(0, 6), dtype=np.float32) + else: + return np.array(xyxy_conf_idx_list, dtype=np.float32) + + + +def main(): + if LOCAL_RANK != -1: + init_dist(launcher='pytorch', backend="nccl" if dist.is_nccl_available() else "gloo") + + cfg = get_merged_config() + miner = YmirMining(cfg) + gpu = max(0, LOCAL_RANK) + device = torch.device('cuda', gpu) + miner.model.to(device) + mining_result = miner.mining() + + if RANK in [0, -1]: + rw.write_mining_result(mining_result=mining_result) + + percent = get_ymir_process(stage=YmirStage.POSTPROCESS, p=1, task_idx=miner.task_idx, task_num=miner.task_num) + monitor.write_monitor_logger(percent=percent) + + return 0 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/det-mmdetection-tmi/ymir_mining_random.py b/det-mmdetection-tmi/ymir_mining_random.py new file mode 100644 index 0000000..0bb5afb --- /dev/null +++ b/det-mmdetection-tmi/ymir_mining_random.py @@ -0,0 +1,87 @@ +import os +import random +import sys + +import torch +import torch.distributed as dist +from easydict import EasyDict as edict +from mmcv.runner import init_dist +from mmdet.apis.test import collect_results_gpu +from tqdm import tqdm +from ymir_exc import result_writer as rw +from ymir_exc.util import YmirStage, get_merged_config, write_ymir_monitor_process + +LOCAL_RANK = int(os.getenv('LOCAL_RANK', -1)) # https://pytorch.org/docs/stable/elastic/run.html +RANK = int(os.getenv('RANK', -1)) +WORLD_SIZE = int(os.getenv('WORLD_SIZE', 1)) + + +class RandomMiner(object): + + def __init__(self, cfg: edict): + if LOCAL_RANK != -1: + init_dist(launcher='pytorch', backend="nccl" if dist.is_nccl_available() else "gloo") + + self.cfg = cfg + gpu = max(0, LOCAL_RANK) + self.device = f'cuda:{gpu}' + + def mining(self): + with open(self.cfg.ymir.input.candidate_index_file, 'r') as f: + images = [line.strip() for line in f.readlines()] + + max_barrier_times = len(images) // WORLD_SIZE + if RANK == -1: + N = len(images) + tbar = tqdm(images) + else: + images_rank = images[RANK::WORLD_SIZE] + N = len(images_rank) + if RANK == 0: + tbar = tqdm(images_rank) + else: + tbar = images_rank + + monitor_gap = max(1, N // 100) + + mining_result = [] + for idx, asset_path in enumerate(tbar): + if idx % monitor_gap == 0: + write_ymir_monitor_process(cfg=self.cfg, + task='mining', + naive_stage_percent=idx / N, + stage=YmirStage.TASK, + task_order='tmi') + + if WORLD_SIZE > 1 and idx < max_barrier_times: + dist.barrier() + + with torch.no_grad():
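+ # scoring is inference-only, so no_grad avoids building autograd graphs and saves GPU memory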
+ score = self.compute_score(asset_path=asset_path) + mining_result.append((asset_path, score)) + + if WORLD_SIZE > 1: + mining_result = collect_results_gpu(mining_result, len(images)) + + if RANK in [0, -1]: + rw.write_mining_result(mining_result=mining_result) + write_ymir_monitor_process(cfg=self.cfg, + task='mining', + naive_stage_percent=1, + stage=YmirStage.POSTPROCESS, + task_order='tmi') + return mining_result + + def compute_score(self, asset_path: str) -> float: + return random.random() + + +def main(): + cfg = get_merged_config() + miner = RandomMiner(cfg) + miner.mining() + return 0 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/det-mmdetection-tmi/ymir_train.py b/det-mmdetection-tmi/ymir_train.py new file mode 100644 index 0000000..06ed4dd --- /dev/null +++ b/det-mmdetection-tmi/ymir_train.py @@ -0,0 +1,83 @@ +import logging +import os +import os.path as osp +import subprocess +import sys + +from easydict import EasyDict as edict +from mmdet.utils.util_ymir import get_best_weight_file, write_ymir_training_result +from ymir_exc import monitor +from ymir_exc.util import YmirStage, find_free_port, get_merged_config, get_ymir_process + + +def main(cfg: edict) -> int: + # default ymir config + gpu_id: str = str(cfg.param.get("gpu_id", '0')) + # an empty gpu_id means CPU training, handled by the num_gpus == 0 branch below + num_gpus = len(gpu_id.split(",")) if gpu_id else 0 + + classes = cfg.param.class_names + num_classes = len(classes) + if num_classes == 0: + raise Exception('class_names not found in config!') + + # mmcv args config + config_file = cfg.param.get("config_file") + args_options = cfg.param.get("args_options", None) + cfg_options = cfg.param.get("cfg_options", None) + + # auto load the offered weight file if the user did not set one
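+ # e.g. (hypothetical values) args_options = "--resume-from /out/models/latest.pth" or + # cfg_options = "load_from=/weights/best.pth" already name a checkpoint explicitly, + # in which case the auto-load below is skipped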
+ if (args_options is None or args_options.find('--resume-from') == -1) and \ + (cfg_options is None or (cfg_options.find('load_from') == -1 and + cfg_options.find('resume_from') == -1)): + + weight_file = get_best_weight_file(cfg) + if weight_file: + if cfg_options: + cfg_options += f' load_from={weight_file}' + else: + cfg_options = f'load_from={weight_file}' + else: + logging.warning('no weight file used for training!') + + monitor.write_monitor_logger( + percent=get_ymir_process(YmirStage.PREPROCESS, p=0.2)) + + work_dir = cfg.ymir.output.models_dir + if num_gpus == 0: + # view https://mmdetection.readthedocs.io/en/stable/1_exist_data_model.html#training-on-cpu + os.environ.setdefault('CUDA_VISIBLE_DEVICES', "-1") + cmd = f"python3 tools/train.py {config_file} " + \ + f"--work-dir {work_dir}" + elif num_gpus == 1: + cmd = f"python3 tools/train.py {config_file} " + \ + f"--work-dir {work_dir} --gpu-id {gpu_id}" + else: + os.environ.setdefault('CUDA_VISIBLE_DEVICES', gpu_id) + port = find_free_port() + os.environ.setdefault('PORT', str(port)) + cmd = f"bash ./tools/dist_train.sh {config_file} {num_gpus} " + \ + f"--work-dir {work_dir}" + + if args_options: + cmd += f" {args_options}" + + if cfg_options: + cmd += f" --cfg-options {cfg_options}" + + logging.info(f"training command: {cmd}") + subprocess.run(cmd.split(), check=True) + + # save the last checkpoint + write_ymir_training_result(last=True) + return 0 + + +if __name__ == '__main__': + cfg = get_merged_config() + os.environ.setdefault('YMIR_MODELS_DIR', cfg.ymir.output.models_dir) + os.environ.setdefault('COCO_EVAL_TMP_FILE', osp.join( + cfg.ymir.output.root_dir, 'eval_tmp.json')) + os.environ.setdefault('PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION', 'python') + sys.exit(main(cfg)) diff --git a/det-yolov4-mining/Dockerfile b/det-yolov4-mining/Dockerfile deleted file mode 100644 index 4305760..0000000 --- a/det-yolov4-mining/Dockerfile +++ /dev/null @@ -1,20 +0,0 @@ -FROM industryessentials/mxnet_python:1.5.0_gpu_cu101mkl_py3_ub18 - -RUN sed -i '/developer\.download\.nvidia\.com\/compute\/cuda\/repos/d' /etc/apt/sources.list.d/* \ - && sed -i '/developer\.download\.nvidia\.com\/compute\/machine-learning\/repos/d' /etc/apt/sources.list.d/* \ - && apt-key del 7fa2af80 \ - && wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-keyring_1.0-1_all.deb \ - && dpkg -i cuda-keyring_1.0-1_all.deb -RUN apt-get update && apt-get install -y --no-install-recommends libsm6 libxext6 libfontconfig1 libxrender1 libgl1-mesa-glx \ - && apt-get clean && rm -rf /var/lib/apt/lists/* - -RUN pip3 install --upgrade pip setuptools wheel && pip3 install opencv-python pyyaml scipy tqdm && rm -rf /root/.cache/pip3 - -COPY . 
/app -WORKDIR /app -RUN cp ./start.sh /usr/bin/start.sh && \ - mkdir -p /img-man && \ - cp ./mining-template.yaml /img-man/mining-template.yaml && \ - cp ./infer-template.yaml /img-man/infer-template.yaml && \ - cp ./README.md /img-man/readme.md -CMD sh /usr/bin/start.sh diff --git a/det-yolov4-mining/cuda112.dockerfile b/det-yolov4-mining/cuda112.dockerfile deleted file mode 100644 index 871b00f..0000000 --- a/det-yolov4-mining/cuda112.dockerfile +++ /dev/null @@ -1,15 +0,0 @@ -FROM industryessentials/ymir-executor:cuda112-yolov4-training - -RUN apt-get update && apt-get install -y --no-install-recommends libsm6 libxext6 libfontconfig1 libxrender1 libgl1-mesa-glx \ - && apt-get clean && rm -rf /var/lib/apt/lists/* - -RUN pip3 install --upgrade pip setuptools wheel && pip3 install opencv-python pyyaml scipy tqdm && rm -rf /root/.cache/pip3 - -COPY . /app -WORKDIR /app -RUN cp ./start.sh /usr/bin/start.sh && \ - mkdir -p /img-man && \ - cp ./mining-template.yaml /img-man/mining-template.yaml && \ - cp ./infer-template.yaml /img-man/infer-template.yaml && \ - cp ./README.md /img-man/readme.md -CMD sh /usr/bin/start.sh diff --git a/det-yolov4-training/.circleci/config.yml b/det-yolov4-tmi/.circleci/config.yml similarity index 100% rename from det-yolov4-training/.circleci/config.yml rename to det-yolov4-tmi/.circleci/config.yml diff --git a/det-yolov4-training/.travis.yml b/det-yolov4-tmi/.travis.yml similarity index 100% rename from det-yolov4-training/.travis.yml rename to det-yolov4-tmi/.travis.yml diff --git a/det-yolov4-training/3rdparty/pthreads/bin/pthreadGC2.dll b/det-yolov4-tmi/3rdparty/pthreads/bin/pthreadGC2.dll similarity index 100% rename from det-yolov4-training/3rdparty/pthreads/bin/pthreadGC2.dll rename to det-yolov4-tmi/3rdparty/pthreads/bin/pthreadGC2.dll diff --git a/det-yolov4-training/3rdparty/pthreads/bin/pthreadVC2.dll b/det-yolov4-tmi/3rdparty/pthreads/bin/pthreadVC2.dll similarity index 100% rename from det-yolov4-training/3rdparty/pthreads/bin/pthreadVC2.dll rename to det-yolov4-tmi/3rdparty/pthreads/bin/pthreadVC2.dll diff --git a/det-yolov4-training/3rdparty/pthreads/include/pthread.h b/det-yolov4-tmi/3rdparty/pthreads/include/pthread.h similarity index 100% rename from det-yolov4-training/3rdparty/pthreads/include/pthread.h rename to det-yolov4-tmi/3rdparty/pthreads/include/pthread.h diff --git a/det-yolov4-training/3rdparty/pthreads/include/sched.h b/det-yolov4-tmi/3rdparty/pthreads/include/sched.h similarity index 100% rename from det-yolov4-training/3rdparty/pthreads/include/sched.h rename to det-yolov4-tmi/3rdparty/pthreads/include/sched.h diff --git a/det-yolov4-training/3rdparty/pthreads/include/semaphore.h b/det-yolov4-tmi/3rdparty/pthreads/include/semaphore.h similarity index 100% rename from det-yolov4-training/3rdparty/pthreads/include/semaphore.h rename to det-yolov4-tmi/3rdparty/pthreads/include/semaphore.h diff --git a/det-yolov4-training/3rdparty/pthreads/lib/libpthreadGC2.a b/det-yolov4-tmi/3rdparty/pthreads/lib/libpthreadGC2.a similarity index 100% rename from det-yolov4-training/3rdparty/pthreads/lib/libpthreadGC2.a rename to det-yolov4-tmi/3rdparty/pthreads/lib/libpthreadGC2.a diff --git a/det-yolov4-training/3rdparty/pthreads/lib/pthreadVC2.lib b/det-yolov4-tmi/3rdparty/pthreads/lib/pthreadVC2.lib similarity index 100% rename from det-yolov4-training/3rdparty/pthreads/lib/pthreadVC2.lib rename to det-yolov4-tmi/3rdparty/pthreads/lib/pthreadVC2.lib diff --git a/det-yolov4-training/3rdparty/stb/include/stb_image.h 
b/det-yolov4-tmi/3rdparty/stb/include/stb_image.h similarity index 100% rename from det-yolov4-training/3rdparty/stb/include/stb_image.h rename to det-yolov4-tmi/3rdparty/stb/include/stb_image.h diff --git a/det-yolov4-training/3rdparty/stb/include/stb_image_write.h b/det-yolov4-tmi/3rdparty/stb/include/stb_image_write.h similarity index 100% rename from det-yolov4-training/3rdparty/stb/include/stb_image_write.h rename to det-yolov4-tmi/3rdparty/stb/include/stb_image_write.h diff --git a/det-yolov4-training/CMakeLists.txt b/det-yolov4-tmi/CMakeLists.txt similarity index 100% rename from det-yolov4-training/CMakeLists.txt rename to det-yolov4-tmi/CMakeLists.txt diff --git a/det-yolov4-training/DarknetConfig.cmake.in b/det-yolov4-tmi/DarknetConfig.cmake.in similarity index 100% rename from det-yolov4-training/DarknetConfig.cmake.in rename to det-yolov4-tmi/DarknetConfig.cmake.in diff --git a/det-yolov4-training/LICENSE b/det-yolov4-tmi/LICENSE similarity index 100% rename from det-yolov4-training/LICENSE rename to det-yolov4-tmi/LICENSE diff --git a/det-yolov4-training/Makefile b/det-yolov4-tmi/Makefile similarity index 100% rename from det-yolov4-training/Makefile rename to det-yolov4-tmi/Makefile diff --git a/det-yolov4-training/README.md b/det-yolov4-tmi/README.md similarity index 100% rename from det-yolov4-training/README.md rename to det-yolov4-tmi/README.md diff --git a/det-yolov4-training/build.ps1 b/det-yolov4-tmi/build.ps1 similarity index 100% rename from det-yolov4-training/build.ps1 rename to det-yolov4-tmi/build.ps1 diff --git a/det-yolov4-training/calc_map.sh b/det-yolov4-tmi/calc_map.sh similarity index 100% rename from det-yolov4-training/calc_map.sh rename to det-yolov4-tmi/calc_map.sh diff --git a/det-yolov4-training/cfg/9k.labels b/det-yolov4-tmi/cfg/9k.labels similarity index 100% rename from det-yolov4-training/cfg/9k.labels rename to det-yolov4-tmi/cfg/9k.labels diff --git a/det-yolov4-training/cfg/9k.names b/det-yolov4-tmi/cfg/9k.names similarity index 100% rename from det-yolov4-training/cfg/9k.names rename to det-yolov4-tmi/cfg/9k.names diff --git a/det-yolov4-training/cfg/9k.tree b/det-yolov4-tmi/cfg/9k.tree similarity index 100% rename from det-yolov4-training/cfg/9k.tree rename to det-yolov4-tmi/cfg/9k.tree diff --git a/det-yolov4-training/cfg/Gaussian_yolov3_BDD.cfg b/det-yolov4-tmi/cfg/Gaussian_yolov3_BDD.cfg similarity index 100% rename from det-yolov4-training/cfg/Gaussian_yolov3_BDD.cfg rename to det-yolov4-tmi/cfg/Gaussian_yolov3_BDD.cfg diff --git a/det-yolov4-training/cfg/alexnet.cfg b/det-yolov4-tmi/cfg/alexnet.cfg similarity index 100% rename from det-yolov4-training/cfg/alexnet.cfg rename to det-yolov4-tmi/cfg/alexnet.cfg diff --git a/det-yolov4-training/cfg/cd53paspp-gamma.cfg b/det-yolov4-tmi/cfg/cd53paspp-gamma.cfg similarity index 100% rename from det-yolov4-training/cfg/cd53paspp-gamma.cfg rename to det-yolov4-tmi/cfg/cd53paspp-gamma.cfg diff --git a/det-yolov4-training/cfg/cifar.cfg b/det-yolov4-tmi/cfg/cifar.cfg similarity index 100% rename from det-yolov4-training/cfg/cifar.cfg rename to det-yolov4-tmi/cfg/cifar.cfg diff --git a/det-yolov4-training/cfg/cifar.test.cfg b/det-yolov4-tmi/cfg/cifar.test.cfg similarity index 100% rename from det-yolov4-training/cfg/cifar.test.cfg rename to det-yolov4-tmi/cfg/cifar.test.cfg diff --git a/det-yolov4-training/cfg/coco.data b/det-yolov4-tmi/cfg/coco.data similarity index 100% rename from det-yolov4-training/cfg/coco.data rename to det-yolov4-tmi/cfg/coco.data diff --git 
a/det-yolov4-training/cfg/coco.names b/det-yolov4-tmi/cfg/coco.names similarity index 100% rename from det-yolov4-training/cfg/coco.names rename to det-yolov4-tmi/cfg/coco.names diff --git a/det-yolov4-training/cfg/coco9k.map b/det-yolov4-tmi/cfg/coco9k.map similarity index 100% rename from det-yolov4-training/cfg/coco9k.map rename to det-yolov4-tmi/cfg/coco9k.map diff --git a/det-yolov4-training/cfg/combine9k.data b/det-yolov4-tmi/cfg/combine9k.data similarity index 100% rename from det-yolov4-training/cfg/combine9k.data rename to det-yolov4-tmi/cfg/combine9k.data diff --git a/det-yolov4-training/cfg/crnn.train.cfg b/det-yolov4-tmi/cfg/crnn.train.cfg similarity index 100% rename from det-yolov4-training/cfg/crnn.train.cfg rename to det-yolov4-tmi/cfg/crnn.train.cfg diff --git a/det-yolov4-training/cfg/csdarknet53-omega.cfg b/det-yolov4-tmi/cfg/csdarknet53-omega.cfg similarity index 100% rename from det-yolov4-training/cfg/csdarknet53-omega.cfg rename to det-yolov4-tmi/cfg/csdarknet53-omega.cfg diff --git a/det-yolov4-training/cfg/cspx-p7-mish-omega.cfg b/det-yolov4-tmi/cfg/cspx-p7-mish-omega.cfg similarity index 100% rename from det-yolov4-training/cfg/cspx-p7-mish-omega.cfg rename to det-yolov4-tmi/cfg/cspx-p7-mish-omega.cfg diff --git a/det-yolov4-training/cfg/cspx-p7-mish.cfg b/det-yolov4-tmi/cfg/cspx-p7-mish.cfg similarity index 100% rename from det-yolov4-training/cfg/cspx-p7-mish.cfg rename to det-yolov4-tmi/cfg/cspx-p7-mish.cfg diff --git a/det-yolov4-training/cfg/cspx-p7-mish_hp.cfg b/det-yolov4-tmi/cfg/cspx-p7-mish_hp.cfg similarity index 100% rename from det-yolov4-training/cfg/cspx-p7-mish_hp.cfg rename to det-yolov4-tmi/cfg/cspx-p7-mish_hp.cfg diff --git a/det-yolov4-training/cfg/csresnext50-panet-spp-original-optimal.cfg b/det-yolov4-tmi/cfg/csresnext50-panet-spp-original-optimal.cfg similarity index 100% rename from det-yolov4-training/cfg/csresnext50-panet-spp-original-optimal.cfg rename to det-yolov4-tmi/cfg/csresnext50-panet-spp-original-optimal.cfg diff --git a/det-yolov4-training/cfg/csresnext50-panet-spp.cfg b/det-yolov4-tmi/cfg/csresnext50-panet-spp.cfg similarity index 100% rename from det-yolov4-training/cfg/csresnext50-panet-spp.cfg rename to det-yolov4-tmi/cfg/csresnext50-panet-spp.cfg diff --git a/det-yolov4-training/cfg/darknet.cfg b/det-yolov4-tmi/cfg/darknet.cfg similarity index 100% rename from det-yolov4-training/cfg/darknet.cfg rename to det-yolov4-tmi/cfg/darknet.cfg diff --git a/det-yolov4-training/cfg/darknet19.cfg b/det-yolov4-tmi/cfg/darknet19.cfg similarity index 100% rename from det-yolov4-training/cfg/darknet19.cfg rename to det-yolov4-tmi/cfg/darknet19.cfg diff --git a/det-yolov4-training/cfg/darknet19_448.cfg b/det-yolov4-tmi/cfg/darknet19_448.cfg similarity index 100% rename from det-yolov4-training/cfg/darknet19_448.cfg rename to det-yolov4-tmi/cfg/darknet19_448.cfg diff --git a/det-yolov4-training/cfg/darknet53.cfg b/det-yolov4-tmi/cfg/darknet53.cfg similarity index 100% rename from det-yolov4-training/cfg/darknet53.cfg rename to det-yolov4-tmi/cfg/darknet53.cfg diff --git a/det-yolov4-training/cfg/darknet53_448_xnor.cfg b/det-yolov4-tmi/cfg/darknet53_448_xnor.cfg similarity index 100% rename from det-yolov4-training/cfg/darknet53_448_xnor.cfg rename to det-yolov4-tmi/cfg/darknet53_448_xnor.cfg diff --git a/det-yolov4-training/cfg/densenet201.cfg b/det-yolov4-tmi/cfg/densenet201.cfg similarity index 100% rename from det-yolov4-training/cfg/densenet201.cfg rename to det-yolov4-tmi/cfg/densenet201.cfg diff --git 
a/det-yolov4-training/cfg/efficientnet-lite3.cfg b/det-yolov4-tmi/cfg/efficientnet-lite3.cfg similarity index 100% rename from det-yolov4-training/cfg/efficientnet-lite3.cfg rename to det-yolov4-tmi/cfg/efficientnet-lite3.cfg diff --git a/det-yolov4-training/cfg/efficientnet_b0.cfg b/det-yolov4-tmi/cfg/efficientnet_b0.cfg similarity index 100% rename from det-yolov4-training/cfg/efficientnet_b0.cfg rename to det-yolov4-tmi/cfg/efficientnet_b0.cfg diff --git a/det-yolov4-training/cfg/enet-coco.cfg b/det-yolov4-tmi/cfg/enet-coco.cfg similarity index 100% rename from det-yolov4-training/cfg/enet-coco.cfg rename to det-yolov4-tmi/cfg/enet-coco.cfg diff --git a/det-yolov4-training/cfg/extraction.cfg b/det-yolov4-tmi/cfg/extraction.cfg similarity index 100% rename from det-yolov4-training/cfg/extraction.cfg rename to det-yolov4-tmi/cfg/extraction.cfg diff --git a/det-yolov4-training/cfg/extraction.conv.cfg b/det-yolov4-tmi/cfg/extraction.conv.cfg similarity index 100% rename from det-yolov4-training/cfg/extraction.conv.cfg rename to det-yolov4-tmi/cfg/extraction.conv.cfg diff --git a/det-yolov4-training/cfg/extraction22k.cfg b/det-yolov4-tmi/cfg/extraction22k.cfg similarity index 100% rename from det-yolov4-training/cfg/extraction22k.cfg rename to det-yolov4-tmi/cfg/extraction22k.cfg diff --git a/det-yolov4-training/cfg/go.test.cfg b/det-yolov4-tmi/cfg/go.test.cfg similarity index 100% rename from det-yolov4-training/cfg/go.test.cfg rename to det-yolov4-tmi/cfg/go.test.cfg diff --git a/det-yolov4-training/cfg/gru.cfg b/det-yolov4-tmi/cfg/gru.cfg similarity index 100% rename from det-yolov4-training/cfg/gru.cfg rename to det-yolov4-tmi/cfg/gru.cfg diff --git a/det-yolov4-training/cfg/imagenet.labels.list b/det-yolov4-tmi/cfg/imagenet.labels.list similarity index 100% rename from det-yolov4-training/cfg/imagenet.labels.list rename to det-yolov4-tmi/cfg/imagenet.labels.list diff --git a/det-yolov4-training/cfg/imagenet.shortnames.list b/det-yolov4-tmi/cfg/imagenet.shortnames.list similarity index 100% rename from det-yolov4-training/cfg/imagenet.shortnames.list rename to det-yolov4-tmi/cfg/imagenet.shortnames.list diff --git a/det-yolov4-training/cfg/imagenet1k.data b/det-yolov4-tmi/cfg/imagenet1k.data similarity index 100% rename from det-yolov4-training/cfg/imagenet1k.data rename to det-yolov4-tmi/cfg/imagenet1k.data diff --git a/det-yolov4-training/cfg/imagenet22k.dataset b/det-yolov4-tmi/cfg/imagenet22k.dataset similarity index 100% rename from det-yolov4-training/cfg/imagenet22k.dataset rename to det-yolov4-tmi/cfg/imagenet22k.dataset diff --git a/det-yolov4-training/cfg/imagenet9k.hierarchy.dataset b/det-yolov4-tmi/cfg/imagenet9k.hierarchy.dataset similarity index 100% rename from det-yolov4-training/cfg/imagenet9k.hierarchy.dataset rename to det-yolov4-tmi/cfg/imagenet9k.hierarchy.dataset diff --git a/det-yolov4-training/cfg/inet9k.map b/det-yolov4-tmi/cfg/inet9k.map similarity index 100% rename from det-yolov4-training/cfg/inet9k.map rename to det-yolov4-tmi/cfg/inet9k.map diff --git a/det-yolov4-training/cfg/jnet-conv.cfg b/det-yolov4-tmi/cfg/jnet-conv.cfg similarity index 100% rename from det-yolov4-training/cfg/jnet-conv.cfg rename to det-yolov4-tmi/cfg/jnet-conv.cfg diff --git a/det-yolov4-training/cfg/lstm.train.cfg b/det-yolov4-tmi/cfg/lstm.train.cfg similarity index 100% rename from det-yolov4-training/cfg/lstm.train.cfg rename to det-yolov4-tmi/cfg/lstm.train.cfg diff --git a/det-yolov4-training/cfg/openimages.data b/det-yolov4-tmi/cfg/openimages.data similarity index 100% rename 
from det-yolov4-training/cfg/openimages.data rename to det-yolov4-tmi/cfg/openimages.data diff --git a/det-yolov4-training/cfg/resnet101.cfg b/det-yolov4-tmi/cfg/resnet101.cfg similarity index 100% rename from det-yolov4-training/cfg/resnet101.cfg rename to det-yolov4-tmi/cfg/resnet101.cfg diff --git a/det-yolov4-training/cfg/resnet152.cfg b/det-yolov4-tmi/cfg/resnet152.cfg similarity index 100% rename from det-yolov4-training/cfg/resnet152.cfg rename to det-yolov4-tmi/cfg/resnet152.cfg diff --git a/det-yolov4-training/cfg/resnet152_trident.cfg b/det-yolov4-tmi/cfg/resnet152_trident.cfg similarity index 100% rename from det-yolov4-training/cfg/resnet152_trident.cfg rename to det-yolov4-tmi/cfg/resnet152_trident.cfg diff --git a/det-yolov4-training/cfg/resnet50.cfg b/det-yolov4-tmi/cfg/resnet50.cfg similarity index 100% rename from det-yolov4-training/cfg/resnet50.cfg rename to det-yolov4-tmi/cfg/resnet50.cfg diff --git a/det-yolov4-training/cfg/resnext152-32x4d.cfg b/det-yolov4-tmi/cfg/resnext152-32x4d.cfg similarity index 100% rename from det-yolov4-training/cfg/resnext152-32x4d.cfg rename to det-yolov4-tmi/cfg/resnext152-32x4d.cfg diff --git a/det-yolov4-training/cfg/rnn.cfg b/det-yolov4-tmi/cfg/rnn.cfg similarity index 100% rename from det-yolov4-training/cfg/rnn.cfg rename to det-yolov4-tmi/cfg/rnn.cfg diff --git a/det-yolov4-training/cfg/rnn.train.cfg b/det-yolov4-tmi/cfg/rnn.train.cfg similarity index 100% rename from det-yolov4-training/cfg/rnn.train.cfg rename to det-yolov4-tmi/cfg/rnn.train.cfg diff --git a/det-yolov4-training/cfg/strided.cfg b/det-yolov4-tmi/cfg/strided.cfg similarity index 100% rename from det-yolov4-training/cfg/strided.cfg rename to det-yolov4-tmi/cfg/strided.cfg diff --git a/det-yolov4-training/cfg/t1.test.cfg b/det-yolov4-tmi/cfg/t1.test.cfg similarity index 100% rename from det-yolov4-training/cfg/t1.test.cfg rename to det-yolov4-tmi/cfg/t1.test.cfg diff --git a/det-yolov4-training/cfg/tiny-yolo-voc.cfg b/det-yolov4-tmi/cfg/tiny-yolo-voc.cfg similarity index 100% rename from det-yolov4-training/cfg/tiny-yolo-voc.cfg rename to det-yolov4-tmi/cfg/tiny-yolo-voc.cfg diff --git a/det-yolov4-training/cfg/tiny-yolo.cfg b/det-yolov4-tmi/cfg/tiny-yolo.cfg similarity index 100% rename from det-yolov4-training/cfg/tiny-yolo.cfg rename to det-yolov4-tmi/cfg/tiny-yolo.cfg diff --git a/det-yolov4-training/cfg/tiny-yolo_xnor.cfg b/det-yolov4-tmi/cfg/tiny-yolo_xnor.cfg similarity index 100% rename from det-yolov4-training/cfg/tiny-yolo_xnor.cfg rename to det-yolov4-tmi/cfg/tiny-yolo_xnor.cfg diff --git a/det-yolov4-training/cfg/tiny.cfg b/det-yolov4-tmi/cfg/tiny.cfg similarity index 100% rename from det-yolov4-training/cfg/tiny.cfg rename to det-yolov4-tmi/cfg/tiny.cfg diff --git a/det-yolov4-training/cfg/vgg-16.cfg b/det-yolov4-tmi/cfg/vgg-16.cfg similarity index 100% rename from det-yolov4-training/cfg/vgg-16.cfg rename to det-yolov4-tmi/cfg/vgg-16.cfg diff --git a/det-yolov4-training/cfg/vgg-conv.cfg b/det-yolov4-tmi/cfg/vgg-conv.cfg similarity index 100% rename from det-yolov4-training/cfg/vgg-conv.cfg rename to det-yolov4-tmi/cfg/vgg-conv.cfg diff --git a/det-yolov4-training/cfg/voc.data b/det-yolov4-tmi/cfg/voc.data similarity index 100% rename from det-yolov4-training/cfg/voc.data rename to det-yolov4-tmi/cfg/voc.data diff --git a/det-yolov4-training/cfg/writing.cfg b/det-yolov4-tmi/cfg/writing.cfg similarity index 100% rename from det-yolov4-training/cfg/writing.cfg rename to det-yolov4-tmi/cfg/writing.cfg diff --git a/det-yolov4-training/cfg/yolo-voc.2.0.cfg 
b/det-yolov4-tmi/cfg/yolo-voc.2.0.cfg similarity index 100% rename from det-yolov4-training/cfg/yolo-voc.2.0.cfg rename to det-yolov4-tmi/cfg/yolo-voc.2.0.cfg diff --git a/det-yolov4-training/cfg/yolo-voc.cfg b/det-yolov4-tmi/cfg/yolo-voc.cfg similarity index 100% rename from det-yolov4-training/cfg/yolo-voc.cfg rename to det-yolov4-tmi/cfg/yolo-voc.cfg diff --git a/det-yolov4-training/cfg/yolo.2.0.cfg b/det-yolov4-tmi/cfg/yolo.2.0.cfg similarity index 100% rename from det-yolov4-training/cfg/yolo.2.0.cfg rename to det-yolov4-tmi/cfg/yolo.2.0.cfg diff --git a/det-yolov4-training/cfg/yolo.cfg b/det-yolov4-tmi/cfg/yolo.cfg similarity index 100% rename from det-yolov4-training/cfg/yolo.cfg rename to det-yolov4-tmi/cfg/yolo.cfg diff --git a/det-yolov4-training/cfg/yolo9000.cfg b/det-yolov4-tmi/cfg/yolo9000.cfg similarity index 100% rename from det-yolov4-training/cfg/yolo9000.cfg rename to det-yolov4-tmi/cfg/yolo9000.cfg diff --git a/det-yolov4-training/cfg/yolov1/tiny-coco.cfg b/det-yolov4-tmi/cfg/yolov1/tiny-coco.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov1/tiny-coco.cfg rename to det-yolov4-tmi/cfg/yolov1/tiny-coco.cfg diff --git a/det-yolov4-training/cfg/yolov1/tiny-yolo.cfg b/det-yolov4-tmi/cfg/yolov1/tiny-yolo.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov1/tiny-yolo.cfg rename to det-yolov4-tmi/cfg/yolov1/tiny-yolo.cfg diff --git a/det-yolov4-training/cfg/yolov1/xyolo.test.cfg b/det-yolov4-tmi/cfg/yolov1/xyolo.test.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov1/xyolo.test.cfg rename to det-yolov4-tmi/cfg/yolov1/xyolo.test.cfg diff --git a/det-yolov4-training/cfg/yolov1/yolo-coco.cfg b/det-yolov4-tmi/cfg/yolov1/yolo-coco.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov1/yolo-coco.cfg rename to det-yolov4-tmi/cfg/yolov1/yolo-coco.cfg diff --git a/det-yolov4-training/cfg/yolov1/yolo-small.cfg b/det-yolov4-tmi/cfg/yolov1/yolo-small.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov1/yolo-small.cfg rename to det-yolov4-tmi/cfg/yolov1/yolo-small.cfg diff --git a/det-yolov4-training/cfg/yolov1/yolo.cfg b/det-yolov4-tmi/cfg/yolov1/yolo.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov1/yolo.cfg rename to det-yolov4-tmi/cfg/yolov1/yolo.cfg diff --git a/det-yolov4-training/cfg/yolov1/yolo.train.cfg b/det-yolov4-tmi/cfg/yolov1/yolo.train.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov1/yolo.train.cfg rename to det-yolov4-tmi/cfg/yolov1/yolo.train.cfg diff --git a/det-yolov4-training/cfg/yolov1/yolo2.cfg b/det-yolov4-tmi/cfg/yolov1/yolo2.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov1/yolo2.cfg rename to det-yolov4-tmi/cfg/yolov1/yolo2.cfg diff --git a/det-yolov4-training/cfg/yolov2-tiny-voc.cfg b/det-yolov4-tmi/cfg/yolov2-tiny-voc.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov2-tiny-voc.cfg rename to det-yolov4-tmi/cfg/yolov2-tiny-voc.cfg diff --git a/det-yolov4-training/cfg/yolov2-tiny.cfg b/det-yolov4-tmi/cfg/yolov2-tiny.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov2-tiny.cfg rename to det-yolov4-tmi/cfg/yolov2-tiny.cfg diff --git a/det-yolov4-training/cfg/yolov2-voc.cfg b/det-yolov4-tmi/cfg/yolov2-voc.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov2-voc.cfg rename to det-yolov4-tmi/cfg/yolov2-voc.cfg diff --git a/det-yolov4-training/cfg/yolov2.cfg b/det-yolov4-tmi/cfg/yolov2.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov2.cfg rename to 
det-yolov4-tmi/cfg/yolov2.cfg diff --git a/det-yolov4-training/cfg/yolov3-openimages.cfg b/det-yolov4-tmi/cfg/yolov3-openimages.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov3-openimages.cfg rename to det-yolov4-tmi/cfg/yolov3-openimages.cfg diff --git a/det-yolov4-training/cfg/yolov3-spp.cfg b/det-yolov4-tmi/cfg/yolov3-spp.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov3-spp.cfg rename to det-yolov4-tmi/cfg/yolov3-spp.cfg diff --git a/det-yolov4-training/cfg/yolov3-tiny-prn.cfg b/det-yolov4-tmi/cfg/yolov3-tiny-prn.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov3-tiny-prn.cfg rename to det-yolov4-tmi/cfg/yolov3-tiny-prn.cfg diff --git a/det-yolov4-training/cfg/yolov3-tiny.cfg b/det-yolov4-tmi/cfg/yolov3-tiny.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov3-tiny.cfg rename to det-yolov4-tmi/cfg/yolov3-tiny.cfg diff --git a/det-yolov4-training/cfg/yolov3-tiny_3l.cfg b/det-yolov4-tmi/cfg/yolov3-tiny_3l.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov3-tiny_3l.cfg rename to det-yolov4-tmi/cfg/yolov3-tiny_3l.cfg diff --git a/det-yolov4-training/cfg/yolov3-tiny_obj.cfg b/det-yolov4-tmi/cfg/yolov3-tiny_obj.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov3-tiny_obj.cfg rename to det-yolov4-tmi/cfg/yolov3-tiny_obj.cfg diff --git a/det-yolov4-training/cfg/yolov3-tiny_occlusion_track.cfg b/det-yolov4-tmi/cfg/yolov3-tiny_occlusion_track.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov3-tiny_occlusion_track.cfg rename to det-yolov4-tmi/cfg/yolov3-tiny_occlusion_track.cfg diff --git a/det-yolov4-training/cfg/yolov3-tiny_xnor.cfg b/det-yolov4-tmi/cfg/yolov3-tiny_xnor.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov3-tiny_xnor.cfg rename to det-yolov4-tmi/cfg/yolov3-tiny_xnor.cfg diff --git a/det-yolov4-training/cfg/yolov3-voc.cfg b/det-yolov4-tmi/cfg/yolov3-voc.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov3-voc.cfg rename to det-yolov4-tmi/cfg/yolov3-voc.cfg diff --git a/det-yolov4-training/cfg/yolov3-voc.yolov3-giou-40.cfg b/det-yolov4-tmi/cfg/yolov3-voc.yolov3-giou-40.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov3-voc.yolov3-giou-40.cfg rename to det-yolov4-tmi/cfg/yolov3-voc.yolov3-giou-40.cfg diff --git a/det-yolov4-training/cfg/yolov3.cfg b/det-yolov4-tmi/cfg/yolov3.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov3.cfg rename to det-yolov4-tmi/cfg/yolov3.cfg diff --git a/det-yolov4-training/cfg/yolov3.coco-giou-12.cfg b/det-yolov4-tmi/cfg/yolov3.coco-giou-12.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov3.coco-giou-12.cfg rename to det-yolov4-tmi/cfg/yolov3.coco-giou-12.cfg diff --git a/det-yolov4-training/cfg/yolov3_5l.cfg b/det-yolov4-tmi/cfg/yolov3_5l.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov3_5l.cfg rename to det-yolov4-tmi/cfg/yolov3_5l.cfg diff --git a/det-yolov4-training/cfg/yolov4-csp-swish.cfg b/det-yolov4-tmi/cfg/yolov4-csp-swish.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov4-csp-swish.cfg rename to det-yolov4-tmi/cfg/yolov4-csp-swish.cfg diff --git a/det-yolov4-training/cfg/yolov4-csp-x-swish-frozen.cfg b/det-yolov4-tmi/cfg/yolov4-csp-x-swish-frozen.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov4-csp-x-swish-frozen.cfg rename to det-yolov4-tmi/cfg/yolov4-csp-x-swish-frozen.cfg diff --git a/det-yolov4-training/cfg/yolov4-csp-x-swish.cfg b/det-yolov4-tmi/cfg/yolov4-csp-x-swish.cfg 
similarity index 100% rename from det-yolov4-training/cfg/yolov4-csp-x-swish.cfg rename to det-yolov4-tmi/cfg/yolov4-csp-x-swish.cfg diff --git a/det-yolov4-training/cfg/yolov4-csp.cfg b/det-yolov4-tmi/cfg/yolov4-csp.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov4-csp.cfg rename to det-yolov4-tmi/cfg/yolov4-csp.cfg diff --git a/det-yolov4-training/cfg/yolov4-custom.cfg b/det-yolov4-tmi/cfg/yolov4-custom.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov4-custom.cfg rename to det-yolov4-tmi/cfg/yolov4-custom.cfg diff --git a/det-yolov4-training/cfg/yolov4-p5-frozen.cfg b/det-yolov4-tmi/cfg/yolov4-p5-frozen.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov4-p5-frozen.cfg rename to det-yolov4-tmi/cfg/yolov4-p5-frozen.cfg diff --git a/det-yolov4-training/cfg/yolov4-p5.cfg b/det-yolov4-tmi/cfg/yolov4-p5.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov4-p5.cfg rename to det-yolov4-tmi/cfg/yolov4-p5.cfg diff --git a/det-yolov4-training/cfg/yolov4-p6.cfg b/det-yolov4-tmi/cfg/yolov4-p6.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov4-p6.cfg rename to det-yolov4-tmi/cfg/yolov4-p6.cfg diff --git a/det-yolov4-training/cfg/yolov4-sam-mish-csp-reorg-bfm.cfg b/det-yolov4-tmi/cfg/yolov4-sam-mish-csp-reorg-bfm.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov4-sam-mish-csp-reorg-bfm.cfg rename to det-yolov4-tmi/cfg/yolov4-sam-mish-csp-reorg-bfm.cfg diff --git a/det-yolov4-training/cfg/yolov4-tiny-3l.cfg b/det-yolov4-tmi/cfg/yolov4-tiny-3l.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov4-tiny-3l.cfg rename to det-yolov4-tmi/cfg/yolov4-tiny-3l.cfg diff --git a/det-yolov4-training/cfg/yolov4-tiny-custom.cfg b/det-yolov4-tmi/cfg/yolov4-tiny-custom.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov4-tiny-custom.cfg rename to det-yolov4-tmi/cfg/yolov4-tiny-custom.cfg diff --git a/det-yolov4-training/cfg/yolov4-tiny.cfg b/det-yolov4-tmi/cfg/yolov4-tiny.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov4-tiny.cfg rename to det-yolov4-tmi/cfg/yolov4-tiny.cfg diff --git a/det-yolov4-training/cfg/yolov4-tiny_contrastive.cfg b/det-yolov4-tmi/cfg/yolov4-tiny_contrastive.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov4-tiny_contrastive.cfg rename to det-yolov4-tmi/cfg/yolov4-tiny_contrastive.cfg diff --git a/det-yolov4-training/cfg/yolov4.cfg b/det-yolov4-tmi/cfg/yolov4.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov4.cfg rename to det-yolov4-tmi/cfg/yolov4.cfg diff --git a/det-yolov4-training/cfg/yolov4_iter1000.cfg b/det-yolov4-tmi/cfg/yolov4_iter1000.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov4_iter1000.cfg rename to det-yolov4-tmi/cfg/yolov4_iter1000.cfg diff --git a/det-yolov4-training/cfg/yolov4x-mish.cfg b/det-yolov4-tmi/cfg/yolov4x-mish.cfg similarity index 100% rename from det-yolov4-training/cfg/yolov4x-mish.cfg rename to det-yolov4-tmi/cfg/yolov4x-mish.cfg diff --git a/det-yolov4-training/cmake/Modules/FindCUDNN.cmake b/det-yolov4-tmi/cmake/Modules/FindCUDNN.cmake similarity index 100% rename from det-yolov4-training/cmake/Modules/FindCUDNN.cmake rename to det-yolov4-tmi/cmake/Modules/FindCUDNN.cmake diff --git a/det-yolov4-training/cmake/Modules/FindPThreads4W.cmake b/det-yolov4-tmi/cmake/Modules/FindPThreads4W.cmake similarity index 100% rename from det-yolov4-training/cmake/Modules/FindPThreads4W.cmake rename to det-yolov4-tmi/cmake/Modules/FindPThreads4W.cmake diff --git 
a/det-yolov4-training/cmake/Modules/FindStb.cmake b/det-yolov4-tmi/cmake/Modules/FindStb.cmake similarity index 100% rename from det-yolov4-training/cmake/Modules/FindStb.cmake rename to det-yolov4-tmi/cmake/Modules/FindStb.cmake diff --git a/det-yolov4-training/config_and_train.py b/det-yolov4-tmi/config_and_train.py similarity index 100% rename from det-yolov4-training/config_and_train.py rename to det-yolov4-tmi/config_and_train.py diff --git a/det-yolov4-training/convert_label_ark2txt.py b/det-yolov4-tmi/convert_label_ark2txt.py similarity index 92% rename from det-yolov4-training/convert_label_ark2txt.py rename to det-yolov4-tmi/convert_label_ark2txt.py index 1043b53..2e963f7 100755 --- a/det-yolov4-training/convert_label_ark2txt.py +++ b/det-yolov4-tmi/convert_label_ark2txt.py @@ -1,6 +1,6 @@ import os +import imagesize -import cv2 def _annotation_path_for_image(image_path: str, annotations_dir: str) -> str: @@ -21,18 +21,16 @@ def _convert_annotations(index_file_path: str, dst_annotations_dir: str) -> None files = f.readlines() files = [each.strip() for each in files] + N = len(files) for i, each_img_anno_path in enumerate(files): if i % 1000 == 0: - print(f"converted {i} image annotations") + print(f"converted {i}/{N} image annotations") # each_imgpath: asset path # each_txtfile: annotation path each_imgpath, each_txtfile = each_img_anno_path.split() - img = cv2.imread(each_imgpath) - if img is None: - raise ValueError(f"can not read image: {each_imgpath}") - img_h, img_w, _ = img.shape + img_w, img_h = imagesize.get(each_imgpath) with open(each_txtfile, 'r') as f: txt_content = f.readlines() diff --git a/det-yolov4-training/convert_model_darknet2mxnet_yolov4.py b/det-yolov4-tmi/convert_model_darknet2mxnet_yolov4.py similarity index 100% rename from det-yolov4-training/convert_model_darknet2mxnet_yolov4.py rename to det-yolov4-tmi/convert_model_darknet2mxnet_yolov4.py diff --git a/det-yolov4-training/counters_per_class.txt b/det-yolov4-tmi/counters_per_class.txt similarity index 100% rename from det-yolov4-training/counters_per_class.txt rename to det-yolov4-tmi/counters_per_class.txt diff --git a/det-yolov4-training/Dockerfile b/det-yolov4-tmi/cuda101.dockerfile similarity index 82% rename from det-yolov4-training/Dockerfile rename to det-yolov4-tmi/cuda101.dockerfile index 61ce1f6..66273c3 100644 --- a/det-yolov4-training/Dockerfile +++ b/det-yolov4-tmi/cuda101.dockerfile @@ -1,5 +1,8 @@ FROM nvidia/cuda:10.1-cudnn7-devel-ubuntu18.04 ARG PIP_SOURCE=https://pypi.mirrors.ustc.edu.cn/simple + +ENV PYTHONPATH=. + WORKDIR /darknet RUN sed -i 's#http://archive.ubuntu.com#https://mirrors.ustc.edu.cn#g' /etc/apt/sources.list RUN apt-key adv --keyserver keyserver.ubuntu.com --recv-keys A4B469963BF863CC && apt-get update @@ -12,11 +15,13 @@ RUN wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_o RUN rm /usr/bin/python3 RUN ln -s /usr/bin/python3.7 /usr/bin/python3 RUN python3 get-pip.py -RUN pip3 install -i ${PIP_SOURCE} mxnet-cu101==1.5.1 numpy opencv-python pyyaml watchdog tensorboardX six +RUN pip3 install -i ${PIP_SOURCE} mxnet-cu101==1.5.1 numpy opencv-python pyyaml watchdog tensorboardX six scipy tqdm imagesize + ENV DEBIAN_FRONTEND noninteractive RUN apt-get update && apt-get install -y libopencv-dev COPY . 
/darknet -RUN cp /darknet/make_train_test_darknet.sh /usr/bin/start.sh -RUN mkdir /img-man && cp /darknet/training-template.yaml /img-man/training-template.yaml RUN make -j + +RUN mkdir /img-man && cp /darknet/training-template.yaml /img-man/training-template.yaml && cp /darknet/mining/*-template.yaml /img-man +RUN echo "python3 /darknet/start.py" > /usr/bin/start.sh CMD bash /usr/bin/start.sh diff --git a/det-yolov4-training/cuda112.dockerfile b/det-yolov4-tmi/cuda112.dockerfile similarity index 82% rename from det-yolov4-training/cuda112.dockerfile rename to det-yolov4-tmi/cuda112.dockerfile index 3e6884b..bab5c7d 100644 --- a/det-yolov4-training/cuda112.dockerfile +++ b/det-yolov4-tmi/cuda112.dockerfile @@ -1,5 +1,8 @@ FROM nvidia/cuda:11.2.1-cudnn8-devel-ubuntu18.04 ARG PIP_SOURCE=https://pypi.mirrors.ustc.edu.cn/simple + +ENV PYTHONPATH=. + WORKDIR /darknet RUN sed -i 's#http://archive.ubuntu.com#https://mirrors.ustc.edu.cn#g' /etc/apt/sources.list RUN apt-key adv --keyserver keyserver.ubuntu.com --recv-keys A4B469963BF863CC && apt-get update @@ -12,12 +15,13 @@ RUN wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_o RUN rm /usr/bin/python3 RUN ln -s /usr/bin/python3.7 /usr/bin/python3 RUN python3 get-pip.py -RUN pip3 install -i ${PIP_SOURCE} mxnet-cu112==1.9.1 numpy opencv-python pyyaml watchdog tensorboardX six +RUN pip3 install -i ${PIP_SOURCE} mxnet-cu112==1.9.1 numpy opencv-python pyyaml watchdog tensorboardX six scipy tqdm imagesize ENV DEBIAN_FRONTEND noninteractive RUN apt-get update && apt-get install -y libopencv-dev COPY . /darknet -RUN cp /darknet/make_train_test_darknet.sh /usr/bin/start.sh -RUN mkdir /img-man && cp /darknet/training-template.yaml /img-man/training-template.yaml RUN make -j + +RUN mkdir /img-man && cp /darknet/training-template.yaml /img-man/training-template.yaml && cp /darknet/mining/*-template.yaml /img-man +RUN echo "python3 /darknet/start.py" > /usr/bin/start.sh CMD bash /usr/bin/start.sh diff --git a/det-yolov4-training/darknet.py b/det-yolov4-tmi/darknet.py similarity index 100% rename from det-yolov4-training/darknet.py rename to det-yolov4-tmi/darknet.py diff --git a/det-yolov4-training/darknet_images.py b/det-yolov4-tmi/darknet_images.py similarity index 100% rename from det-yolov4-training/darknet_images.py rename to det-yolov4-tmi/darknet_images.py diff --git a/det-yolov4-training/darknet_video.py b/det-yolov4-tmi/darknet_video.py similarity index 100% rename from det-yolov4-training/darknet_video.py rename to det-yolov4-tmi/darknet_video.py diff --git a/det-yolov4-training/data/9k.tree b/det-yolov4-tmi/data/9k.tree similarity index 100% rename from det-yolov4-training/data/9k.tree rename to det-yolov4-tmi/data/9k.tree diff --git a/det-yolov4-training/data/coco.names b/det-yolov4-tmi/data/coco.names similarity index 100% rename from det-yolov4-training/data/coco.names rename to det-yolov4-tmi/data/coco.names diff --git a/det-yolov4-training/data/coco9k.map b/det-yolov4-tmi/data/coco9k.map similarity index 100% rename from det-yolov4-training/data/coco9k.map rename to det-yolov4-tmi/data/coco9k.map diff --git a/det-yolov4-training/data/goal.txt b/det-yolov4-tmi/data/goal.txt similarity index 100% rename from det-yolov4-training/data/goal.txt rename to det-yolov4-tmi/data/goal.txt diff --git a/det-yolov4-training/data/imagenet.labels.list b/det-yolov4-tmi/data/imagenet.labels.list similarity index 100% rename from det-yolov4-training/data/imagenet.labels.list rename to det-yolov4-tmi/data/imagenet.labels.list diff --git 
a/det-yolov4-training/data/imagenet.shortnames.list b/det-yolov4-tmi/data/imagenet.shortnames.list similarity index 100% rename from det-yolov4-training/data/imagenet.shortnames.list rename to det-yolov4-tmi/data/imagenet.shortnames.list diff --git a/det-yolov4-training/data/labels/make_labels.py b/det-yolov4-tmi/data/labels/make_labels.py similarity index 100% rename from det-yolov4-training/data/labels/make_labels.py rename to det-yolov4-tmi/data/labels/make_labels.py diff --git a/det-yolov4-training/data/openimages.names b/det-yolov4-tmi/data/openimages.names similarity index 100% rename from det-yolov4-training/data/openimages.names rename to det-yolov4-tmi/data/openimages.names diff --git a/det-yolov4-training/data/voc.names b/det-yolov4-tmi/data/voc.names similarity index 100% rename from det-yolov4-training/data/voc.names rename to det-yolov4-tmi/data/voc.names diff --git a/det-yolov4-training/image_yolov3.sh b/det-yolov4-tmi/image_yolov3.sh similarity index 100% rename from det-yolov4-training/image_yolov3.sh rename to det-yolov4-tmi/image_yolov3.sh diff --git a/det-yolov4-training/image_yolov4.sh b/det-yolov4-tmi/image_yolov4.sh similarity index 100% rename from det-yolov4-training/image_yolov4.sh rename to det-yolov4-tmi/image_yolov4.sh diff --git a/det-yolov4-training/img.txt b/det-yolov4-tmi/img.txt similarity index 100% rename from det-yolov4-training/img.txt rename to det-yolov4-tmi/img.txt diff --git a/det-yolov4-training/include/darknet.h b/det-yolov4-tmi/include/darknet.h similarity index 100% rename from det-yolov4-training/include/darknet.h rename to det-yolov4-tmi/include/darknet.h diff --git a/det-yolov4-training/include/yolo_v2_class.hpp b/det-yolov4-tmi/include/yolo_v2_class.hpp similarity index 100% rename from det-yolov4-training/include/yolo_v2_class.hpp rename to det-yolov4-tmi/include/yolo_v2_class.hpp diff --git a/det-yolov4-training/json_mjpeg_streams.sh b/det-yolov4-tmi/json_mjpeg_streams.sh similarity index 100% rename from det-yolov4-training/json_mjpeg_streams.sh rename to det-yolov4-tmi/json_mjpeg_streams.sh diff --git a/det-yolov4-training/make_train_test_darknet.sh b/det-yolov4-tmi/make_train_test_darknet.sh similarity index 100% rename from det-yolov4-training/make_train_test_darknet.sh rename to det-yolov4-tmi/make_train_test_darknet.sh diff --git a/det-yolov4-mining/.dockerignore b/det-yolov4-tmi/mining/.dockerignore similarity index 100% rename from det-yolov4-mining/.dockerignore rename to det-yolov4-tmi/mining/.dockerignore diff --git a/det-yolov4-mining/README.md b/det-yolov4-tmi/mining/README.md similarity index 100% rename from det-yolov4-mining/README.md rename to det-yolov4-tmi/mining/README.md diff --git a/det-yolov4-mining/active_learning/__init__.py b/det-yolov4-tmi/mining/active_learning/__init__.py similarity index 100% rename from det-yolov4-mining/active_learning/__init__.py rename to det-yolov4-tmi/mining/active_learning/__init__.py diff --git a/det-yolov4-mining/active_learning/apis/__init__.py b/det-yolov4-tmi/mining/active_learning/apis/__init__.py similarity index 100% rename from det-yolov4-mining/active_learning/apis/__init__.py rename to det-yolov4-tmi/mining/active_learning/apis/__init__.py diff --git a/det-yolov4-mining/active_learning/apis/al_api.py b/det-yolov4-tmi/mining/active_learning/apis/al_api.py similarity index 100% rename from det-yolov4-mining/active_learning/apis/al_api.py rename to det-yolov4-tmi/mining/active_learning/apis/al_api.py diff --git a/det-yolov4-mining/active_learning/apis/docker_api.py 
b/det-yolov4-tmi/mining/active_learning/apis/docker_api.py similarity index 100% rename from det-yolov4-mining/active_learning/apis/docker_api.py rename to det-yolov4-tmi/mining/active_learning/apis/docker_api.py diff --git a/det-yolov4-mining/active_learning/dataset/__init__.py b/det-yolov4-tmi/mining/active_learning/dataset/__init__.py similarity index 100% rename from det-yolov4-mining/active_learning/dataset/__init__.py rename to det-yolov4-tmi/mining/active_learning/dataset/__init__.py diff --git a/det-yolov4-mining/active_learning/dataset/datareader.py b/det-yolov4-tmi/mining/active_learning/dataset/datareader.py similarity index 100% rename from det-yolov4-mining/active_learning/dataset/datareader.py rename to det-yolov4-tmi/mining/active_learning/dataset/datareader.py diff --git a/det-yolov4-mining/active_learning/dataset/labeled_dataset.py b/det-yolov4-tmi/mining/active_learning/dataset/labeled_dataset.py similarity index 100% rename from det-yolov4-mining/active_learning/dataset/labeled_dataset.py rename to det-yolov4-tmi/mining/active_learning/dataset/labeled_dataset.py diff --git a/det-yolov4-mining/active_learning/dataset/unlabeled_dataset.py b/det-yolov4-tmi/mining/active_learning/dataset/unlabeled_dataset.py similarity index 100% rename from det-yolov4-mining/active_learning/dataset/unlabeled_dataset.py rename to det-yolov4-tmi/mining/active_learning/dataset/unlabeled_dataset.py diff --git a/det-yolov4-mining/active_learning/model_inference/__init__.py b/det-yolov4-tmi/mining/active_learning/model_inference/__init__.py similarity index 100% rename from det-yolov4-mining/active_learning/model_inference/__init__.py rename to det-yolov4-tmi/mining/active_learning/model_inference/__init__.py diff --git a/det-yolov4-mining/active_learning/model_inference/centernet.py b/det-yolov4-tmi/mining/active_learning/model_inference/centernet.py similarity index 100% rename from det-yolov4-mining/active_learning/model_inference/centernet.py rename to det-yolov4-tmi/mining/active_learning/model_inference/centernet.py diff --git a/det-yolov4-mining/active_learning/model_inference/yolo_models.py b/det-yolov4-tmi/mining/active_learning/model_inference/yolo_models.py similarity index 100% rename from det-yolov4-mining/active_learning/model_inference/yolo_models.py rename to det-yolov4-tmi/mining/active_learning/model_inference/yolo_models.py diff --git a/det-yolov4-mining/active_learning/strategy/__init__.py b/det-yolov4-tmi/mining/active_learning/strategy/__init__.py similarity index 100% rename from det-yolov4-mining/active_learning/strategy/__init__.py rename to det-yolov4-tmi/mining/active_learning/strategy/__init__.py diff --git a/det-yolov4-mining/active_learning/strategy/aldd.py b/det-yolov4-tmi/mining/active_learning/strategy/aldd.py similarity index 100% rename from det-yolov4-mining/active_learning/strategy/aldd.py rename to det-yolov4-tmi/mining/active_learning/strategy/aldd.py diff --git a/det-yolov4-mining/active_learning/strategy/aldd_yolo.py b/det-yolov4-tmi/mining/active_learning/strategy/aldd_yolo.py similarity index 100% rename from det-yolov4-mining/active_learning/strategy/aldd_yolo.py rename to det-yolov4-tmi/mining/active_learning/strategy/aldd_yolo.py diff --git a/det-yolov4-mining/active_learning/strategy/cald.py b/det-yolov4-tmi/mining/active_learning/strategy/cald.py similarity index 100% rename from det-yolov4-mining/active_learning/strategy/cald.py rename to det-yolov4-tmi/mining/active_learning/strategy/cald.py diff --git 
a/det-yolov4-mining/active_learning/strategy/data_augment.py b/det-yolov4-tmi/mining/active_learning/strategy/data_augment.py
similarity index 100%
rename from det-yolov4-mining/active_learning/strategy/data_augment.py
rename to det-yolov4-tmi/mining/active_learning/strategy/data_augment.py
diff --git a/det-yolov4-mining/active_learning/strategy/random_strategy.py b/det-yolov4-tmi/mining/active_learning/strategy/random_strategy.py
similarity index 100%
rename from det-yolov4-mining/active_learning/strategy/random_strategy.py
rename to det-yolov4-tmi/mining/active_learning/strategy/random_strategy.py
diff --git a/det-yolov4-mining/active_learning/utils/__init__.py b/det-yolov4-tmi/mining/active_learning/utils/__init__.py
similarity index 100%
rename from det-yolov4-mining/active_learning/utils/__init__.py
rename to det-yolov4-tmi/mining/active_learning/utils/__init__.py
diff --git a/det-yolov4-mining/active_learning/utils/al_log.py b/det-yolov4-tmi/mining/active_learning/utils/al_log.py
similarity index 100%
rename from det-yolov4-mining/active_learning/utils/al_log.py
rename to det-yolov4-tmi/mining/active_learning/utils/al_log.py
diff --git a/det-yolov4-mining/active_learning/utils/operator.py b/det-yolov4-tmi/mining/active_learning/utils/operator.py
similarity index 100%
rename from det-yolov4-mining/active_learning/utils/operator.py
rename to det-yolov4-tmi/mining/active_learning/utils/operator.py
diff --git a/det-yolov4-mining/al_main.py b/det-yolov4-tmi/mining/al_main.py
similarity index 100%
rename from det-yolov4-mining/al_main.py
rename to det-yolov4-tmi/mining/al_main.py
diff --git a/det-yolov4-mining/combined_class.txt b/det-yolov4-tmi/mining/combined_class.txt
similarity index 100%
rename from det-yolov4-mining/combined_class.txt
rename to det-yolov4-tmi/mining/combined_class.txt
diff --git a/det-yolov4-mining/docker_main.py b/det-yolov4-tmi/mining/docker_main.py
similarity index 89%
rename from det-yolov4-mining/docker_main.py
rename to det-yolov4-tmi/mining/docker_main.py
index 3eb4641..359d066 100644
--- a/det-yolov4-mining/docker_main.py
+++ b/det-yolov4-tmi/mining/docker_main.py
@@ -9,8 +9,8 @@
 import write_result


-def _load_config() -> dict:
-    with open("/in/config.yaml", "r", encoding='utf8') as f:
+def _load_config(config_file) -> dict:
+    with open(config_file, "r", encoding='utf8') as f:
         config = yaml.safe_load(f)

     # set default task id
@@ -34,10 +34,12 @@
 if __name__ == '__main__':
-    config = _load_config()
+    config = _load_config("/in/config.yaml")

-    run_infer = int(config['run_infer'])
-    run_mining = int(config['run_mining'])
+    with open("/in/env.yaml", "r", encoding='utf8') as f:
+        env_config = yaml.safe_load(f)
+    run_infer = int(env_config['run_infer'])
+    run_mining = int(env_config['run_mining'])

     if not run_infer and not run_mining:
         raise ValueError('both run_infer and run_mining set to 0, abort')
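The `docker_main.py` hunk above separates the executor's two inputs: user-editable hyperparameters stay in `/in/config.yaml`, while the stage flags that the ymir platform itself writes move to `/in/env.yaml`. A minimal sketch of the contract this refactor assumes (only the `run_infer`/`run_mining` keys are taken from the hunk; the sample values are illustrative):

```python
import yaml

# an /in/env.yaml as this refactor expects it: the platform, not the user
# hyperparameter template, decides which stages run
sample_env = """
run_mining: 1
run_infer: 0
"""
env_config = yaml.safe_load(sample_env)

run_infer = int(env_config['run_infer'])
run_mining = int(env_config['run_mining'])
if not run_infer and not run_mining:
    raise ValueError('both run_infer and run_mining set to 0, abort')
```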
diff --git a/det-yolov4-mining/docker_readme.md b/det-yolov4-tmi/mining/docker_readme.md
similarity index 100%
rename from det-yolov4-mining/docker_readme.md
rename to det-yolov4-tmi/mining/docker_readme.md
diff --git a/det-yolov4-mining/infer-template.yaml b/det-yolov4-tmi/mining/infer-template.yaml
similarity index 97%
rename from det-yolov4-mining/infer-template.yaml
rename to det-yolov4-tmi/mining/infer-template.yaml
index dce6501..11c6502 100644
--- a/det-yolov4-mining/infer-template.yaml
+++ b/det-yolov4-tmi/mining/infer-template.yaml
@@ -14,7 +14,7 @@ write_result: True
 confidence_thresh: 0.1
 nms_thresh: 0.45
 max_boxes: 50
-# shm_size: '16G'
+shm_size: '128G'
 # gpu_id: ''
 # model_params_path: []
 # class_names:
diff --git a/det-yolov4-mining/mining-template.yaml b/det-yolov4-tmi/mining/mining-template.yaml
similarity index 93%
rename from det-yolov4-mining/mining-template.yaml
rename to det-yolov4-tmi/mining/mining-template.yaml
index e02770f..2ff8270 100644
--- a/det-yolov4-mining/mining-template.yaml
+++ b/det-yolov4-tmi/mining/mining-template.yaml
@@ -13,14 +13,14 @@ model_type: detection
 strategy: aldd_yolo
 image_height: 608
 image_width: 608
-batch_size: 16
+batch_size: 4
 anchors: '12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401'
 confidence_thresh: 0.1
 nms_thresh: 0.45
 max_boxes: 50
-# shm_size: '16G'
+shm_size: '128G'
 # gpu_id: '0,1,2,3'
 # model_params_path: []
 # task_id: cycle-node-mined-0
 # class_names:
-# - expose_rubbish
\ No newline at end of file
+# - expose_rubbish
diff --git a/det-yolov4-mining/monitor_process.py b/det-yolov4-tmi/mining/monitor_process.py
similarity index 100%
rename from det-yolov4-mining/monitor_process.py
rename to det-yolov4-tmi/mining/monitor_process.py
diff --git a/det-yolov4-mining/start.sh b/det-yolov4-tmi/mining/start.sh
similarity index 100%
rename from det-yolov4-mining/start.sh
rename to det-yolov4-tmi/mining/start.sh
diff --git a/det-yolov4-mining/test_api.py b/det-yolov4-tmi/mining/test_api.py
similarity index 100%
rename from det-yolov4-mining/test_api.py
rename to det-yolov4-tmi/mining/test_api.py
diff --git a/det-yolov4-mining/test_centernet.py b/det-yolov4-tmi/mining/test_centernet.py
similarity index 100%
rename from det-yolov4-mining/test_centernet.py
rename to det-yolov4-tmi/mining/test_centernet.py
diff --git a/det-yolov4-mining/tools/al_strategsy_union.py b/det-yolov4-tmi/mining/tools/al_strategsy_union.py
similarity index 100%
rename from det-yolov4-mining/tools/al_strategsy_union.py
rename to det-yolov4-tmi/mining/tools/al_strategsy_union.py
diff --git a/det-yolov4-mining/tools/imagenet_hard_negative.py b/det-yolov4-tmi/mining/tools/imagenet_hard_negative.py
similarity index 100%
rename from det-yolov4-mining/tools/imagenet_hard_negative.py
rename to det-yolov4-tmi/mining/tools/imagenet_hard_negative.py
diff --git a/det-yolov4-mining/tools/plot_dataset_class_hist.py b/det-yolov4-tmi/mining/tools/plot_dataset_class_hist.py
similarity index 100%
rename from det-yolov4-mining/tools/plot_dataset_class_hist.py
rename to det-yolov4-tmi/mining/tools/plot_dataset_class_hist.py
diff --git a/det-yolov4-mining/tools/visualize_aldd.py b/det-yolov4-tmi/mining/tools/visualize_aldd.py
similarity index 100%
rename from det-yolov4-mining/tools/visualize_aldd.py
rename to det-yolov4-tmi/mining/tools/visualize_aldd.py
diff --git a/det-yolov4-mining/tools/visualize_cald.py b/det-yolov4-tmi/mining/tools/visualize_cald.py
similarity index 100%
rename from det-yolov4-mining/tools/visualize_cald.py
rename to det-yolov4-tmi/mining/tools/visualize_cald.py
diff --git a/det-yolov4-mining/write_result.py b/det-yolov4-tmi/mining/write_result.py
similarity index 100%
rename from det-yolov4-mining/write_result.py
rename to det-yolov4-tmi/mining/write_result.py
diff --git a/det-yolov4-training/net_cam_v3.sh b/det-yolov4-tmi/net_cam_v3.sh
similarity index 100%
rename from det-yolov4-training/net_cam_v3.sh
rename to det-yolov4-tmi/net_cam_v3.sh
diff --git a/det-yolov4-training/net_cam_v4.sh b/det-yolov4-tmi/net_cam_v4.sh
similarity index 100%
rename from det-yolov4-training/net_cam_v4.sh
rename to det-yolov4-tmi/net_cam_v4.sh
diff --git
a/det-yolov4-training/src/.editorconfig b/det-yolov4-tmi/src/.editorconfig similarity index 100% rename from det-yolov4-training/src/.editorconfig rename to det-yolov4-tmi/src/.editorconfig diff --git a/det-yolov4-training/src/activation_kernels.cu b/det-yolov4-tmi/src/activation_kernels.cu similarity index 100% rename from det-yolov4-training/src/activation_kernels.cu rename to det-yolov4-tmi/src/activation_kernels.cu diff --git a/det-yolov4-training/src/activation_layer.c b/det-yolov4-tmi/src/activation_layer.c similarity index 100% rename from det-yolov4-training/src/activation_layer.c rename to det-yolov4-tmi/src/activation_layer.c diff --git a/det-yolov4-training/src/activation_layer.h b/det-yolov4-tmi/src/activation_layer.h similarity index 100% rename from det-yolov4-training/src/activation_layer.h rename to det-yolov4-tmi/src/activation_layer.h diff --git a/det-yolov4-training/src/activations.c b/det-yolov4-tmi/src/activations.c similarity index 100% rename from det-yolov4-training/src/activations.c rename to det-yolov4-tmi/src/activations.c diff --git a/det-yolov4-training/src/activations.h b/det-yolov4-tmi/src/activations.h similarity index 100% rename from det-yolov4-training/src/activations.h rename to det-yolov4-tmi/src/activations.h diff --git a/det-yolov4-training/src/art.c b/det-yolov4-tmi/src/art.c similarity index 100% rename from det-yolov4-training/src/art.c rename to det-yolov4-tmi/src/art.c diff --git a/det-yolov4-training/src/avgpool_layer.c b/det-yolov4-tmi/src/avgpool_layer.c similarity index 100% rename from det-yolov4-training/src/avgpool_layer.c rename to det-yolov4-tmi/src/avgpool_layer.c diff --git a/det-yolov4-training/src/avgpool_layer.h b/det-yolov4-tmi/src/avgpool_layer.h similarity index 100% rename from det-yolov4-training/src/avgpool_layer.h rename to det-yolov4-tmi/src/avgpool_layer.h diff --git a/det-yolov4-training/src/avgpool_layer_kernels.cu b/det-yolov4-tmi/src/avgpool_layer_kernels.cu similarity index 100% rename from det-yolov4-training/src/avgpool_layer_kernels.cu rename to det-yolov4-tmi/src/avgpool_layer_kernels.cu diff --git a/det-yolov4-training/src/batchnorm_layer.c b/det-yolov4-tmi/src/batchnorm_layer.c similarity index 100% rename from det-yolov4-training/src/batchnorm_layer.c rename to det-yolov4-tmi/src/batchnorm_layer.c diff --git a/det-yolov4-training/src/batchnorm_layer.h b/det-yolov4-tmi/src/batchnorm_layer.h similarity index 100% rename from det-yolov4-training/src/batchnorm_layer.h rename to det-yolov4-tmi/src/batchnorm_layer.h diff --git a/det-yolov4-training/src/blas.c b/det-yolov4-tmi/src/blas.c similarity index 100% rename from det-yolov4-training/src/blas.c rename to det-yolov4-tmi/src/blas.c diff --git a/det-yolov4-training/src/blas.h b/det-yolov4-tmi/src/blas.h similarity index 100% rename from det-yolov4-training/src/blas.h rename to det-yolov4-tmi/src/blas.h diff --git a/det-yolov4-training/src/blas_kernels.cu b/det-yolov4-tmi/src/blas_kernels.cu similarity index 100% rename from det-yolov4-training/src/blas_kernels.cu rename to det-yolov4-tmi/src/blas_kernels.cu diff --git a/det-yolov4-training/src/box.c b/det-yolov4-tmi/src/box.c similarity index 100% rename from det-yolov4-training/src/box.c rename to det-yolov4-tmi/src/box.c diff --git a/det-yolov4-training/src/box.h b/det-yolov4-tmi/src/box.h similarity index 100% rename from det-yolov4-training/src/box.h rename to det-yolov4-tmi/src/box.h diff --git a/det-yolov4-training/src/captcha.c b/det-yolov4-tmi/src/captcha.c similarity index 100% rename from 
det-yolov4-training/src/captcha.c rename to det-yolov4-tmi/src/captcha.c diff --git a/det-yolov4-training/src/cifar.c b/det-yolov4-tmi/src/cifar.c similarity index 100% rename from det-yolov4-training/src/cifar.c rename to det-yolov4-tmi/src/cifar.c diff --git a/det-yolov4-training/src/classifier.c b/det-yolov4-tmi/src/classifier.c similarity index 100% rename from det-yolov4-training/src/classifier.c rename to det-yolov4-tmi/src/classifier.c diff --git a/det-yolov4-training/src/classifier.h b/det-yolov4-tmi/src/classifier.h similarity index 100% rename from det-yolov4-training/src/classifier.h rename to det-yolov4-tmi/src/classifier.h diff --git a/det-yolov4-training/src/coco.c b/det-yolov4-tmi/src/coco.c similarity index 100% rename from det-yolov4-training/src/coco.c rename to det-yolov4-tmi/src/coco.c diff --git a/det-yolov4-training/src/col2im.c b/det-yolov4-tmi/src/col2im.c similarity index 100% rename from det-yolov4-training/src/col2im.c rename to det-yolov4-tmi/src/col2im.c diff --git a/det-yolov4-training/src/col2im.h b/det-yolov4-tmi/src/col2im.h similarity index 100% rename from det-yolov4-training/src/col2im.h rename to det-yolov4-tmi/src/col2im.h diff --git a/det-yolov4-training/src/col2im_kernels.cu b/det-yolov4-tmi/src/col2im_kernels.cu similarity index 100% rename from det-yolov4-training/src/col2im_kernels.cu rename to det-yolov4-tmi/src/col2im_kernels.cu diff --git a/det-yolov4-training/src/compare.c b/det-yolov4-tmi/src/compare.c similarity index 100% rename from det-yolov4-training/src/compare.c rename to det-yolov4-tmi/src/compare.c diff --git a/det-yolov4-training/src/connected_layer.c b/det-yolov4-tmi/src/connected_layer.c similarity index 100% rename from det-yolov4-training/src/connected_layer.c rename to det-yolov4-tmi/src/connected_layer.c diff --git a/det-yolov4-training/src/connected_layer.h b/det-yolov4-tmi/src/connected_layer.h similarity index 100% rename from det-yolov4-training/src/connected_layer.h rename to det-yolov4-tmi/src/connected_layer.h diff --git a/det-yolov4-training/src/conv_lstm_layer.c b/det-yolov4-tmi/src/conv_lstm_layer.c similarity index 100% rename from det-yolov4-training/src/conv_lstm_layer.c rename to det-yolov4-tmi/src/conv_lstm_layer.c diff --git a/det-yolov4-training/src/conv_lstm_layer.h b/det-yolov4-tmi/src/conv_lstm_layer.h similarity index 100% rename from det-yolov4-training/src/conv_lstm_layer.h rename to det-yolov4-tmi/src/conv_lstm_layer.h diff --git a/det-yolov4-training/src/convolutional_kernels.cu b/det-yolov4-tmi/src/convolutional_kernels.cu similarity index 100% rename from det-yolov4-training/src/convolutional_kernels.cu rename to det-yolov4-tmi/src/convolutional_kernels.cu diff --git a/det-yolov4-training/src/convolutional_layer.c b/det-yolov4-tmi/src/convolutional_layer.c similarity index 100% rename from det-yolov4-training/src/convolutional_layer.c rename to det-yolov4-tmi/src/convolutional_layer.c diff --git a/det-yolov4-training/src/convolutional_layer.h b/det-yolov4-tmi/src/convolutional_layer.h similarity index 100% rename from det-yolov4-training/src/convolutional_layer.h rename to det-yolov4-tmi/src/convolutional_layer.h diff --git a/det-yolov4-training/src/cost_layer.c b/det-yolov4-tmi/src/cost_layer.c similarity index 100% rename from det-yolov4-training/src/cost_layer.c rename to det-yolov4-tmi/src/cost_layer.c diff --git a/det-yolov4-training/src/cost_layer.h b/det-yolov4-tmi/src/cost_layer.h similarity index 100% rename from det-yolov4-training/src/cost_layer.h rename to det-yolov4-tmi/src/cost_layer.h 
diff --git a/det-yolov4-training/src/cpu_gemm.c b/det-yolov4-tmi/src/cpu_gemm.c similarity index 100% rename from det-yolov4-training/src/cpu_gemm.c rename to det-yolov4-tmi/src/cpu_gemm.c diff --git a/det-yolov4-training/src/crnn_layer.c b/det-yolov4-tmi/src/crnn_layer.c similarity index 100% rename from det-yolov4-training/src/crnn_layer.c rename to det-yolov4-tmi/src/crnn_layer.c diff --git a/det-yolov4-training/src/crnn_layer.h b/det-yolov4-tmi/src/crnn_layer.h similarity index 100% rename from det-yolov4-training/src/crnn_layer.h rename to det-yolov4-tmi/src/crnn_layer.h diff --git a/det-yolov4-training/src/crop_layer.c b/det-yolov4-tmi/src/crop_layer.c similarity index 100% rename from det-yolov4-training/src/crop_layer.c rename to det-yolov4-tmi/src/crop_layer.c diff --git a/det-yolov4-training/src/crop_layer.h b/det-yolov4-tmi/src/crop_layer.h similarity index 100% rename from det-yolov4-training/src/crop_layer.h rename to det-yolov4-tmi/src/crop_layer.h diff --git a/det-yolov4-training/src/crop_layer_kernels.cu b/det-yolov4-tmi/src/crop_layer_kernels.cu similarity index 100% rename from det-yolov4-training/src/crop_layer_kernels.cu rename to det-yolov4-tmi/src/crop_layer_kernels.cu diff --git a/det-yolov4-training/src/csharp/CMakeLists.txt b/det-yolov4-tmi/src/csharp/CMakeLists.txt similarity index 100% rename from det-yolov4-training/src/csharp/CMakeLists.txt rename to det-yolov4-tmi/src/csharp/CMakeLists.txt diff --git a/det-yolov4-training/src/csharp/YoloCSharpWrapper.cs b/det-yolov4-tmi/src/csharp/YoloCSharpWrapper.cs similarity index 100% rename from det-yolov4-training/src/csharp/YoloCSharpWrapper.cs rename to det-yolov4-tmi/src/csharp/YoloCSharpWrapper.cs diff --git a/det-yolov4-training/src/dark_cuda.c b/det-yolov4-tmi/src/dark_cuda.c similarity index 100% rename from det-yolov4-training/src/dark_cuda.c rename to det-yolov4-tmi/src/dark_cuda.c diff --git a/det-yolov4-training/src/dark_cuda.h b/det-yolov4-tmi/src/dark_cuda.h similarity index 100% rename from det-yolov4-training/src/dark_cuda.h rename to det-yolov4-tmi/src/dark_cuda.h diff --git a/det-yolov4-training/src/darknet.c b/det-yolov4-tmi/src/darknet.c similarity index 100% rename from det-yolov4-training/src/darknet.c rename to det-yolov4-tmi/src/darknet.c diff --git a/det-yolov4-training/src/darkunistd.h b/det-yolov4-tmi/src/darkunistd.h similarity index 100% rename from det-yolov4-training/src/darkunistd.h rename to det-yolov4-tmi/src/darkunistd.h diff --git a/det-yolov4-training/src/data.c b/det-yolov4-tmi/src/data.c similarity index 100% rename from det-yolov4-training/src/data.c rename to det-yolov4-tmi/src/data.c diff --git a/det-yolov4-training/src/data.h b/det-yolov4-tmi/src/data.h similarity index 100% rename from det-yolov4-training/src/data.h rename to det-yolov4-tmi/src/data.h diff --git a/det-yolov4-training/src/deconvolutional_kernels.cu b/det-yolov4-tmi/src/deconvolutional_kernels.cu similarity index 100% rename from det-yolov4-training/src/deconvolutional_kernels.cu rename to det-yolov4-tmi/src/deconvolutional_kernels.cu diff --git a/det-yolov4-training/src/deconvolutional_layer.c b/det-yolov4-tmi/src/deconvolutional_layer.c similarity index 100% rename from det-yolov4-training/src/deconvolutional_layer.c rename to det-yolov4-tmi/src/deconvolutional_layer.c diff --git a/det-yolov4-training/src/deconvolutional_layer.h b/det-yolov4-tmi/src/deconvolutional_layer.h similarity index 100% rename from det-yolov4-training/src/deconvolutional_layer.h rename to det-yolov4-tmi/src/deconvolutional_layer.h diff 
--git a/det-yolov4-training/src/demo.c b/det-yolov4-tmi/src/demo.c similarity index 100% rename from det-yolov4-training/src/demo.c rename to det-yolov4-tmi/src/demo.c diff --git a/det-yolov4-training/src/demo.h b/det-yolov4-tmi/src/demo.h similarity index 100% rename from det-yolov4-training/src/demo.h rename to det-yolov4-tmi/src/demo.h diff --git a/det-yolov4-training/src/detection_layer.c b/det-yolov4-tmi/src/detection_layer.c similarity index 100% rename from det-yolov4-training/src/detection_layer.c rename to det-yolov4-tmi/src/detection_layer.c diff --git a/det-yolov4-training/src/detection_layer.h b/det-yolov4-tmi/src/detection_layer.h similarity index 100% rename from det-yolov4-training/src/detection_layer.h rename to det-yolov4-tmi/src/detection_layer.h diff --git a/det-yolov4-training/src/detector.c b/det-yolov4-tmi/src/detector.c similarity index 100% rename from det-yolov4-training/src/detector.c rename to det-yolov4-tmi/src/detector.c diff --git a/det-yolov4-training/src/dice.c b/det-yolov4-tmi/src/dice.c similarity index 100% rename from det-yolov4-training/src/dice.c rename to det-yolov4-tmi/src/dice.c diff --git a/det-yolov4-training/src/dropout_layer.c b/det-yolov4-tmi/src/dropout_layer.c similarity index 100% rename from det-yolov4-training/src/dropout_layer.c rename to det-yolov4-tmi/src/dropout_layer.c diff --git a/det-yolov4-training/src/dropout_layer.h b/det-yolov4-tmi/src/dropout_layer.h similarity index 100% rename from det-yolov4-training/src/dropout_layer.h rename to det-yolov4-tmi/src/dropout_layer.h diff --git a/det-yolov4-training/src/dropout_layer_kernels.cu b/det-yolov4-tmi/src/dropout_layer_kernels.cu similarity index 100% rename from det-yolov4-training/src/dropout_layer_kernels.cu rename to det-yolov4-tmi/src/dropout_layer_kernels.cu diff --git a/det-yolov4-training/src/gaussian_yolo_layer.c b/det-yolov4-tmi/src/gaussian_yolo_layer.c similarity index 100% rename from det-yolov4-training/src/gaussian_yolo_layer.c rename to det-yolov4-tmi/src/gaussian_yolo_layer.c diff --git a/det-yolov4-training/src/gaussian_yolo_layer.h b/det-yolov4-tmi/src/gaussian_yolo_layer.h similarity index 100% rename from det-yolov4-training/src/gaussian_yolo_layer.h rename to det-yolov4-tmi/src/gaussian_yolo_layer.h diff --git a/det-yolov4-training/src/gemm.c b/det-yolov4-tmi/src/gemm.c similarity index 100% rename from det-yolov4-training/src/gemm.c rename to det-yolov4-tmi/src/gemm.c diff --git a/det-yolov4-training/src/gemm.h b/det-yolov4-tmi/src/gemm.h similarity index 100% rename from det-yolov4-training/src/gemm.h rename to det-yolov4-tmi/src/gemm.h diff --git a/det-yolov4-training/src/getopt.c b/det-yolov4-tmi/src/getopt.c similarity index 100% rename from det-yolov4-training/src/getopt.c rename to det-yolov4-tmi/src/getopt.c diff --git a/det-yolov4-training/src/getopt.h b/det-yolov4-tmi/src/getopt.h similarity index 100% rename from det-yolov4-training/src/getopt.h rename to det-yolov4-tmi/src/getopt.h diff --git a/det-yolov4-training/src/gettimeofday.c b/det-yolov4-tmi/src/gettimeofday.c similarity index 100% rename from det-yolov4-training/src/gettimeofday.c rename to det-yolov4-tmi/src/gettimeofday.c diff --git a/det-yolov4-training/src/gettimeofday.h b/det-yolov4-tmi/src/gettimeofday.h similarity index 100% rename from det-yolov4-training/src/gettimeofday.h rename to det-yolov4-tmi/src/gettimeofday.h diff --git a/det-yolov4-training/src/go.c b/det-yolov4-tmi/src/go.c similarity index 100% rename from det-yolov4-training/src/go.c rename to det-yolov4-tmi/src/go.c diff 
--git a/det-yolov4-training/src/gru_layer.c b/det-yolov4-tmi/src/gru_layer.c similarity index 100% rename from det-yolov4-training/src/gru_layer.c rename to det-yolov4-tmi/src/gru_layer.c diff --git a/det-yolov4-training/src/gru_layer.h b/det-yolov4-tmi/src/gru_layer.h similarity index 100% rename from det-yolov4-training/src/gru_layer.h rename to det-yolov4-tmi/src/gru_layer.h diff --git a/det-yolov4-training/src/http_stream.cpp b/det-yolov4-tmi/src/http_stream.cpp similarity index 100% rename from det-yolov4-training/src/http_stream.cpp rename to det-yolov4-tmi/src/http_stream.cpp diff --git a/det-yolov4-training/src/http_stream.h b/det-yolov4-tmi/src/http_stream.h similarity index 100% rename from det-yolov4-training/src/http_stream.h rename to det-yolov4-tmi/src/http_stream.h diff --git a/det-yolov4-training/src/httplib.h b/det-yolov4-tmi/src/httplib.h similarity index 100% rename from det-yolov4-training/src/httplib.h rename to det-yolov4-tmi/src/httplib.h diff --git a/det-yolov4-training/src/im2col.c b/det-yolov4-tmi/src/im2col.c similarity index 100% rename from det-yolov4-training/src/im2col.c rename to det-yolov4-tmi/src/im2col.c diff --git a/det-yolov4-training/src/im2col.h b/det-yolov4-tmi/src/im2col.h similarity index 100% rename from det-yolov4-training/src/im2col.h rename to det-yolov4-tmi/src/im2col.h diff --git a/det-yolov4-training/src/im2col_kernels.cu b/det-yolov4-tmi/src/im2col_kernels.cu similarity index 100% rename from det-yolov4-training/src/im2col_kernels.cu rename to det-yolov4-tmi/src/im2col_kernels.cu diff --git a/det-yolov4-training/src/image.c b/det-yolov4-tmi/src/image.c similarity index 100% rename from det-yolov4-training/src/image.c rename to det-yolov4-tmi/src/image.c diff --git a/det-yolov4-training/src/image.h b/det-yolov4-tmi/src/image.h similarity index 100% rename from det-yolov4-training/src/image.h rename to det-yolov4-tmi/src/image.h diff --git a/det-yolov4-training/src/image_opencv.cpp b/det-yolov4-tmi/src/image_opencv.cpp similarity index 100% rename from det-yolov4-training/src/image_opencv.cpp rename to det-yolov4-tmi/src/image_opencv.cpp diff --git a/det-yolov4-training/src/image_opencv.h b/det-yolov4-tmi/src/image_opencv.h similarity index 100% rename from det-yolov4-training/src/image_opencv.h rename to det-yolov4-tmi/src/image_opencv.h diff --git a/det-yolov4-training/src/layer.c b/det-yolov4-tmi/src/layer.c similarity index 100% rename from det-yolov4-training/src/layer.c rename to det-yolov4-tmi/src/layer.c diff --git a/det-yolov4-training/src/layer.h b/det-yolov4-tmi/src/layer.h similarity index 100% rename from det-yolov4-training/src/layer.h rename to det-yolov4-tmi/src/layer.h diff --git a/det-yolov4-training/src/list.c b/det-yolov4-tmi/src/list.c similarity index 100% rename from det-yolov4-training/src/list.c rename to det-yolov4-tmi/src/list.c diff --git a/det-yolov4-training/src/list.h b/det-yolov4-tmi/src/list.h similarity index 100% rename from det-yolov4-training/src/list.h rename to det-yolov4-tmi/src/list.h diff --git a/det-yolov4-training/src/local_layer.c b/det-yolov4-tmi/src/local_layer.c similarity index 100% rename from det-yolov4-training/src/local_layer.c rename to det-yolov4-tmi/src/local_layer.c diff --git a/det-yolov4-training/src/local_layer.h b/det-yolov4-tmi/src/local_layer.h similarity index 100% rename from det-yolov4-training/src/local_layer.h rename to det-yolov4-tmi/src/local_layer.h diff --git a/det-yolov4-training/src/lstm_layer.c b/det-yolov4-tmi/src/lstm_layer.c similarity index 100% rename from 
det-yolov4-training/src/lstm_layer.c rename to det-yolov4-tmi/src/lstm_layer.c diff --git a/det-yolov4-training/src/lstm_layer.h b/det-yolov4-tmi/src/lstm_layer.h similarity index 100% rename from det-yolov4-training/src/lstm_layer.h rename to det-yolov4-tmi/src/lstm_layer.h diff --git a/det-yolov4-training/src/matrix.c b/det-yolov4-tmi/src/matrix.c similarity index 100% rename from det-yolov4-training/src/matrix.c rename to det-yolov4-tmi/src/matrix.c diff --git a/det-yolov4-training/src/matrix.h b/det-yolov4-tmi/src/matrix.h similarity index 100% rename from det-yolov4-training/src/matrix.h rename to det-yolov4-tmi/src/matrix.h diff --git a/det-yolov4-training/src/maxpool_layer.c b/det-yolov4-tmi/src/maxpool_layer.c similarity index 100% rename from det-yolov4-training/src/maxpool_layer.c rename to det-yolov4-tmi/src/maxpool_layer.c diff --git a/det-yolov4-training/src/maxpool_layer.h b/det-yolov4-tmi/src/maxpool_layer.h similarity index 100% rename from det-yolov4-training/src/maxpool_layer.h rename to det-yolov4-tmi/src/maxpool_layer.h diff --git a/det-yolov4-training/src/maxpool_layer_kernels.cu b/det-yolov4-tmi/src/maxpool_layer_kernels.cu similarity index 100% rename from det-yolov4-training/src/maxpool_layer_kernels.cu rename to det-yolov4-tmi/src/maxpool_layer_kernels.cu diff --git a/det-yolov4-training/src/network.c b/det-yolov4-tmi/src/network.c similarity index 100% rename from det-yolov4-training/src/network.c rename to det-yolov4-tmi/src/network.c diff --git a/det-yolov4-training/src/network.h b/det-yolov4-tmi/src/network.h similarity index 100% rename from det-yolov4-training/src/network.h rename to det-yolov4-tmi/src/network.h diff --git a/det-yolov4-training/src/network_kernels.cu b/det-yolov4-tmi/src/network_kernels.cu similarity index 100% rename from det-yolov4-training/src/network_kernels.cu rename to det-yolov4-tmi/src/network_kernels.cu diff --git a/det-yolov4-training/src/nightmare.c b/det-yolov4-tmi/src/nightmare.c similarity index 100% rename from det-yolov4-training/src/nightmare.c rename to det-yolov4-tmi/src/nightmare.c diff --git a/det-yolov4-training/src/normalization_layer.c b/det-yolov4-tmi/src/normalization_layer.c similarity index 100% rename from det-yolov4-training/src/normalization_layer.c rename to det-yolov4-tmi/src/normalization_layer.c diff --git a/det-yolov4-training/src/normalization_layer.h b/det-yolov4-tmi/src/normalization_layer.h similarity index 100% rename from det-yolov4-training/src/normalization_layer.h rename to det-yolov4-tmi/src/normalization_layer.h diff --git a/det-yolov4-training/src/option_list.c b/det-yolov4-tmi/src/option_list.c similarity index 100% rename from det-yolov4-training/src/option_list.c rename to det-yolov4-tmi/src/option_list.c diff --git a/det-yolov4-training/src/option_list.h b/det-yolov4-tmi/src/option_list.h similarity index 100% rename from det-yolov4-training/src/option_list.h rename to det-yolov4-tmi/src/option_list.h diff --git a/det-yolov4-training/src/parser.c b/det-yolov4-tmi/src/parser.c similarity index 100% rename from det-yolov4-training/src/parser.c rename to det-yolov4-tmi/src/parser.c diff --git a/det-yolov4-training/src/parser.h b/det-yolov4-tmi/src/parser.h similarity index 100% rename from det-yolov4-training/src/parser.h rename to det-yolov4-tmi/src/parser.h diff --git a/det-yolov4-training/src/region_layer.c b/det-yolov4-tmi/src/region_layer.c similarity index 100% rename from det-yolov4-training/src/region_layer.c rename to det-yolov4-tmi/src/region_layer.c diff --git 
a/det-yolov4-training/src/region_layer.h b/det-yolov4-tmi/src/region_layer.h similarity index 100% rename from det-yolov4-training/src/region_layer.h rename to det-yolov4-tmi/src/region_layer.h diff --git a/det-yolov4-training/src/reorg_layer.c b/det-yolov4-tmi/src/reorg_layer.c similarity index 100% rename from det-yolov4-training/src/reorg_layer.c rename to det-yolov4-tmi/src/reorg_layer.c diff --git a/det-yolov4-training/src/reorg_layer.h b/det-yolov4-tmi/src/reorg_layer.h similarity index 100% rename from det-yolov4-training/src/reorg_layer.h rename to det-yolov4-tmi/src/reorg_layer.h diff --git a/det-yolov4-training/src/reorg_old_layer.c b/det-yolov4-tmi/src/reorg_old_layer.c similarity index 100% rename from det-yolov4-training/src/reorg_old_layer.c rename to det-yolov4-tmi/src/reorg_old_layer.c diff --git a/det-yolov4-training/src/reorg_old_layer.h b/det-yolov4-tmi/src/reorg_old_layer.h similarity index 100% rename from det-yolov4-training/src/reorg_old_layer.h rename to det-yolov4-tmi/src/reorg_old_layer.h diff --git a/det-yolov4-training/src/representation_layer.c b/det-yolov4-tmi/src/representation_layer.c similarity index 100% rename from det-yolov4-training/src/representation_layer.c rename to det-yolov4-tmi/src/representation_layer.c diff --git a/det-yolov4-training/src/representation_layer.h b/det-yolov4-tmi/src/representation_layer.h similarity index 100% rename from det-yolov4-training/src/representation_layer.h rename to det-yolov4-tmi/src/representation_layer.h diff --git a/det-yolov4-training/src/rnn.c b/det-yolov4-tmi/src/rnn.c similarity index 100% rename from det-yolov4-training/src/rnn.c rename to det-yolov4-tmi/src/rnn.c diff --git a/det-yolov4-training/src/rnn_layer.c b/det-yolov4-tmi/src/rnn_layer.c similarity index 100% rename from det-yolov4-training/src/rnn_layer.c rename to det-yolov4-tmi/src/rnn_layer.c diff --git a/det-yolov4-training/src/rnn_layer.h b/det-yolov4-tmi/src/rnn_layer.h similarity index 100% rename from det-yolov4-training/src/rnn_layer.h rename to det-yolov4-tmi/src/rnn_layer.h diff --git a/det-yolov4-training/src/rnn_vid.c b/det-yolov4-tmi/src/rnn_vid.c similarity index 100% rename from det-yolov4-training/src/rnn_vid.c rename to det-yolov4-tmi/src/rnn_vid.c diff --git a/det-yolov4-training/src/route_layer.c b/det-yolov4-tmi/src/route_layer.c similarity index 100% rename from det-yolov4-training/src/route_layer.c rename to det-yolov4-tmi/src/route_layer.c diff --git a/det-yolov4-training/src/route_layer.h b/det-yolov4-tmi/src/route_layer.h similarity index 100% rename from det-yolov4-training/src/route_layer.h rename to det-yolov4-tmi/src/route_layer.h diff --git a/det-yolov4-training/src/sam_layer.c b/det-yolov4-tmi/src/sam_layer.c similarity index 100% rename from det-yolov4-training/src/sam_layer.c rename to det-yolov4-tmi/src/sam_layer.c diff --git a/det-yolov4-training/src/sam_layer.h b/det-yolov4-tmi/src/sam_layer.h similarity index 100% rename from det-yolov4-training/src/sam_layer.h rename to det-yolov4-tmi/src/sam_layer.h diff --git a/det-yolov4-training/src/scale_channels_layer.c b/det-yolov4-tmi/src/scale_channels_layer.c similarity index 100% rename from det-yolov4-training/src/scale_channels_layer.c rename to det-yolov4-tmi/src/scale_channels_layer.c diff --git a/det-yolov4-training/src/scale_channels_layer.h b/det-yolov4-tmi/src/scale_channels_layer.h similarity index 100% rename from det-yolov4-training/src/scale_channels_layer.h rename to det-yolov4-tmi/src/scale_channels_layer.h diff --git 
a/det-yolov4-training/src/shortcut_layer.c b/det-yolov4-tmi/src/shortcut_layer.c similarity index 100% rename from det-yolov4-training/src/shortcut_layer.c rename to det-yolov4-tmi/src/shortcut_layer.c diff --git a/det-yolov4-training/src/shortcut_layer.h b/det-yolov4-tmi/src/shortcut_layer.h similarity index 100% rename from det-yolov4-training/src/shortcut_layer.h rename to det-yolov4-tmi/src/shortcut_layer.h diff --git a/det-yolov4-training/src/softmax_layer.c b/det-yolov4-tmi/src/softmax_layer.c similarity index 100% rename from det-yolov4-training/src/softmax_layer.c rename to det-yolov4-tmi/src/softmax_layer.c diff --git a/det-yolov4-training/src/softmax_layer.h b/det-yolov4-tmi/src/softmax_layer.h similarity index 100% rename from det-yolov4-training/src/softmax_layer.h rename to det-yolov4-tmi/src/softmax_layer.h diff --git a/det-yolov4-training/src/super.c b/det-yolov4-tmi/src/super.c similarity index 100% rename from det-yolov4-training/src/super.c rename to det-yolov4-tmi/src/super.c diff --git a/det-yolov4-training/src/swag.c b/det-yolov4-tmi/src/swag.c similarity index 100% rename from det-yolov4-training/src/swag.c rename to det-yolov4-tmi/src/swag.c diff --git a/det-yolov4-training/src/tag.c b/det-yolov4-tmi/src/tag.c similarity index 100% rename from det-yolov4-training/src/tag.c rename to det-yolov4-tmi/src/tag.c diff --git a/det-yolov4-training/src/tree.c b/det-yolov4-tmi/src/tree.c similarity index 100% rename from det-yolov4-training/src/tree.c rename to det-yolov4-tmi/src/tree.c diff --git a/det-yolov4-training/src/tree.h b/det-yolov4-tmi/src/tree.h similarity index 100% rename from det-yolov4-training/src/tree.h rename to det-yolov4-tmi/src/tree.h diff --git a/det-yolov4-training/src/upsample_layer.c b/det-yolov4-tmi/src/upsample_layer.c similarity index 100% rename from det-yolov4-training/src/upsample_layer.c rename to det-yolov4-tmi/src/upsample_layer.c diff --git a/det-yolov4-training/src/upsample_layer.h b/det-yolov4-tmi/src/upsample_layer.h similarity index 100% rename from det-yolov4-training/src/upsample_layer.h rename to det-yolov4-tmi/src/upsample_layer.h diff --git a/det-yolov4-training/src/utils.c b/det-yolov4-tmi/src/utils.c similarity index 100% rename from det-yolov4-training/src/utils.c rename to det-yolov4-tmi/src/utils.c diff --git a/det-yolov4-training/src/utils.h b/det-yolov4-tmi/src/utils.h similarity index 100% rename from det-yolov4-training/src/utils.h rename to det-yolov4-tmi/src/utils.h diff --git a/det-yolov4-training/src/version.h b/det-yolov4-tmi/src/version.h similarity index 100% rename from det-yolov4-training/src/version.h rename to det-yolov4-tmi/src/version.h diff --git a/det-yolov4-training/src/version.h.in b/det-yolov4-tmi/src/version.h.in similarity index 100% rename from det-yolov4-training/src/version.h.in rename to det-yolov4-tmi/src/version.h.in diff --git a/det-yolov4-training/src/voxel.c b/det-yolov4-tmi/src/voxel.c similarity index 100% rename from det-yolov4-training/src/voxel.c rename to det-yolov4-tmi/src/voxel.c diff --git a/det-yolov4-training/src/writing.c b/det-yolov4-tmi/src/writing.c similarity index 100% rename from det-yolov4-training/src/writing.c rename to det-yolov4-tmi/src/writing.c diff --git a/det-yolov4-training/src/yolo.c b/det-yolov4-tmi/src/yolo.c similarity index 100% rename from det-yolov4-training/src/yolo.c rename to det-yolov4-tmi/src/yolo.c diff --git a/det-yolov4-training/src/yolo_console_dll.cpp b/det-yolov4-tmi/src/yolo_console_dll.cpp similarity index 100% rename from 
det-yolov4-training/src/yolo_console_dll.cpp
rename to det-yolov4-tmi/src/yolo_console_dll.cpp
diff --git a/det-yolov4-training/src/yolo_layer.c b/det-yolov4-tmi/src/yolo_layer.c
similarity index 100%
rename from det-yolov4-training/src/yolo_layer.c
rename to det-yolov4-tmi/src/yolo_layer.c
diff --git a/det-yolov4-training/src/yolo_layer.h b/det-yolov4-tmi/src/yolo_layer.h
similarity index 100%
rename from det-yolov4-training/src/yolo_layer.h
rename to det-yolov4-tmi/src/yolo_layer.h
diff --git a/det-yolov4-training/src/yolo_v2_class.cpp b/det-yolov4-tmi/src/yolo_v2_class.cpp
similarity index 100%
rename from det-yolov4-training/src/yolo_v2_class.cpp
rename to det-yolov4-tmi/src/yolo_v2_class.cpp
diff --git a/det-yolov4-tmi/start.py b/det-yolov4-tmi/start.py
new file mode 100644
index 0000000..67da850
--- /dev/null
+++ b/det-yolov4-tmi/start.py
@@ -0,0 +1,24 @@
+import logging
+import subprocess
+import sys
+
+import yaml
+
+
+def start() -> int:
+    with open("/in/env.yaml", "r", encoding='utf8') as f:
+        config = yaml.safe_load(f)
+
+    logging.info(f"config is {config}")
+    if config['run_training']:
+        cmd = 'bash /darknet/make_train_test_darknet.sh'
+        cwd = '/darknet'
+    else:
+        cmd = 'python3 docker_main.py'
+        cwd = '/darknet/mining'
+    subprocess.run(cmd, check=True, shell=True, cwd=cwd)
+
+    return 0
+
+if __name__ == '__main__':
+    sys.exit(start())
diff --git a/det-yolov4-training/train.sh b/det-yolov4-tmi/train.sh
similarity index 100%
rename from det-yolov4-training/train.sh
rename to det-yolov4-tmi/train.sh
diff --git a/det-yolov4-training/train_watcher.py b/det-yolov4-tmi/train_watcher.py
similarity index 100%
rename from det-yolov4-training/train_watcher.py
rename to det-yolov4-tmi/train_watcher.py
diff --git a/det-yolov4-training/train_yolov3.sh b/det-yolov4-tmi/train_yolov3.sh
similarity index 100%
rename from det-yolov4-training/train_yolov3.sh
rename to det-yolov4-tmi/train_yolov3.sh
diff --git a/det-yolov4-training/training-template.yaml b/det-yolov4-tmi/training-template.yaml
similarity index 82%
rename from det-yolov4-training/training-template.yaml
rename to det-yolov4-tmi/training-template.yaml
index 17c32f7..bb276dc 100644
--- a/det-yolov4-training/training-template.yaml
+++ b/det-yolov4-tmi/training-template.yaml
@@ -5,8 +5,9 @@ learning_rate: 0.0013
 max_batches: 20000
 warmup_iterations: 1000
 batch: 64
-subdivisions: 32
-shm_size: '16G'
+subdivisions: 64
+shm_size: '128G'
+export_format: 'ark:raw'
 # class_names:
 # - cat
 # gpu_id: '0,1,2,3'
diff --git a/det-yolov4-training/video_yolov3.sh b/det-yolov4-tmi/video_yolov3.sh
similarity index 100%
rename from det-yolov4-training/video_yolov3.sh
rename to det-yolov4-tmi/video_yolov3.sh
diff --git a/det-yolov4-training/video_yolov4.sh b/det-yolov4-tmi/video_yolov4.sh
similarity index 100%
rename from det-yolov4-training/video_yolov4.sh
rename to det-yolov4-tmi/video_yolov4.sh
diff --git a/det-yolov4-training/warm_up_training.py b/det-yolov4-tmi/warm_up_training.py
similarity index 100%
rename from det-yolov4-training/warm_up_training.py
rename to det-yolov4-tmi/warm_up_training.py
diff --git a/det-yolov5-tmi/.dockerignore b/det-yolov5-tmi/.dockerignore
index af51ccc..9f34de6 100644
--- a/det-yolov5-tmi/.dockerignore
+++ b/det-yolov5-tmi/.dockerignore
@@ -12,8 +12,9 @@ data/samples/*
 *.jpg

 # Neural Network weights -----------------------------------------------------------------------------------------------
-**/*.pt
+#**/*.pt
 **/*.pth
+**/*.pkl
 **/*.onnx
 **/*.engine
 **/*.mlmodel
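The new `det-yolov4-tmi/start.py` above is the glue that lets a single image serve training, mining and inference: the dockerfiles install `python3 /darknet/start.py` as `/usr/bin/start.sh`, and the script reads `/in/env.yaml` to choose between the darknet training script and the mining/inference `docker_main.py`. A hedged sketch of driving it by hand outside ymir (assumes a writable `/in` mount; the flag values are illustrative):

```python
import subprocess

import yaml

# write the env file the dispatcher expects; run_training routes to darknet
# training, anything else falls through to docker_main.py in /darknet/mining
with open('/in/env.yaml', 'w', encoding='utf8') as f:
    yaml.safe_dump({'run_training': 1, 'run_mining': 0, 'run_infer': 0}, f)

subprocess.run('bash /usr/bin/start.sh', shell=True, check=True)
```

Note also the training-template change above: with `batch: 64` and `subdivisions: 64`, darknet processes one image per forward/backward step instead of two, which roughly halves peak GPU memory relative to `subdivisions: 32` at the cost of slower iterations.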
diff --git a/det-yolov5-tmi/Dockerfile b/det-yolov5-tmi/Dockerfile
deleted file mode 100644
index 489dd04..0000000
--- a/det-yolov5-tmi/Dockerfile
+++ /dev/null
@@ -1,64 +0,0 @@
-# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
-
-# Start FROM Nvidia PyTorch image https://ngc.nvidia.com/catalog/containers/nvidia:pytorch
-FROM nvcr.io/nvidia/pytorch:21.10-py3
-
-# Install linux packages
-RUN apt update && apt install -y zip htop screen libgl1-mesa-glx
-
-# Install python dependencies
-COPY requirements.txt .
-RUN python -m pip install --upgrade pip
-RUN pip uninstall -y torch torchvision torchtext
-RUN pip install --no-cache -r requirements.txt albumentations wandb gsutil notebook \
-    torch==1.10.2+cu113 torchvision==0.11.3+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
-# RUN pip install --no-cache -U torch torchvision
-
-# Create working directory
-RUN mkdir -p /usr/src/app
-WORKDIR /usr/src/app
-
-# Copy contents
-COPY . /usr/src/app
-
-# Downloads to user config dir
-ADD https://ultralytics.com/assets/Arial.ttf /root/.config/Ultralytics/
-
-# Set environment variables
-# ENV HOME=/usr/src/app
-
-
-# Usage Examples -------------------------------------------------------------------------------------------------------
-
-# Build and Push
-# t=ultralytics/yolov5:latest && sudo docker build -t $t . && sudo docker push $t
-
-# Pull and Run
-# t=ultralytics/yolov5:latest && sudo docker pull $t && sudo docker run -it --ipc=host --gpus all $t
-
-# Pull and Run with local directory access
-# t=ultralytics/yolov5:latest && sudo docker pull $t && sudo docker run -it --ipc=host --gpus all -v "$(pwd)"/datasets:/usr/src/datasets $t
-
-# Kill all
-# sudo docker kill $(sudo docker ps -q)
-
-# Kill all image-based
-# sudo docker kill $(sudo docker ps -qa --filter ancestor=ultralytics/yolov5:latest)
-
-# Bash into running container
-# sudo docker exec -it 5a9b5863d93d bash
-
-# Bash into stopped container
-# id=$(sudo docker ps -qa) && sudo docker start $id && sudo docker exec -it $id bash
-
-# Clean up
-# docker system prune -a --volumes
-
-# Update Ubuntu drivers
-# https://www.maketecheasier.com/install-nvidia-drivers-ubuntu/
-
-# DDP test
-# python -m torch.distributed.run --nproc_per_node 2 --master_port 1 train.py --epochs 3
-
-# GCP VM from Image
-# docker.io/ultralytics/yolov5:latest
diff --git a/det-yolov5-tmi/mining/mining_cald.py b/det-yolov5-tmi/mining/mining_cald.py
deleted file mode 100644
index d93fb43..0000000
--- a/det-yolov5-tmi/mining/mining_cald.py
+++ /dev/null
@@ -1,145 +0,0 @@
-"""
-Consistency-based Active Learning for Object Detection CVPR 2022 workshop
-official code: https://github.com/we1pingyu/CALD/blob/master/cald_train.py
-"""
-import sys
-from typing import Dict, List, Tuple
-
-import cv2
-import numpy as np
-from nptyping import NDArray
-from scipy.stats import entropy
-from tqdm import tqdm
-from ymir_exc import dataset_reader as dr
-from ymir_exc import env, monitor
-from ymir_exc import result_writer as rw
-
-from mining.data_augment import cutout, horizontal_flip, intersect, resize, rotate
-from utils.ymir_yolov5 import BBOX, CV_IMAGE, YmirYolov5, YmirStage, get_ymir_process, get_merged_config
-
-
-def split_result(result: NDArray) -> Tuple[BBOX, NDArray, NDArray]:
-    if len(result) > 0:
-        bboxes = result[:, :4].astype(np.int32)
-        conf = result[:, 4]
-        class_id = result[:, 5]
-    else:
-        bboxes = np.zeros(shape=(0, 4), dtype=np.int32)
-        conf = np.zeros(shape=(0, 1), dtype=np.float32)
-        class_id = np.zeros(shape=(0, 1), dtype=np.int32)
-
-    return bboxes, conf, class_id
-
-
-class MiningCald(YmirYolov5):
-    def mining(self) -> List:
-        N = dr.items_count(env.DatasetType.CANDIDATE)
-        monitor_gap = max(1, N // 100)
-        idx = -1
-        beta = 1.3
-        mining_result = []
-        for asset_path, _ in tqdm(dr.item_paths(dataset_type=env.DatasetType.CANDIDATE)):
-            img = cv2.imread(asset_path)
-            # xyxy,conf,cls
-            result = self.predict(img)
-            bboxes, conf, _ = split_result(result)
-            if len(result) == 0:
-                # no result for the image without augmentation
-                mining_result.append((asset_path, -beta))
-                continue
-
-            consistency = 0.0
-            aug_bboxes_dict, aug_results_dict = self.aug_predict(img, bboxes)
-            for key in aug_results_dict:
-                # no result for the image with augmentation f'{key}'
-                if len(aug_results_dict[key]) == 0:
-                    consistency += beta
-                    continue
-
-                bboxes_key, conf_key, _ = split_result(aug_results_dict[key])
-                cls_scores_aug = 1 - conf_key
-                cls_scores = 1 - conf
-
-                consistency_per_aug = 2.0
-                ious = get_ious(bboxes_key, aug_bboxes_dict[key])
-                aug_idxs = np.argmax(ious, axis=0)
-                for origin_idx, aug_idx in enumerate(aug_idxs):
-                    max_iou = ious[aug_idx, origin_idx]
-                    if max_iou == 0:
-                        consistency_per_aug = min(consistency_per_aug, beta)
-                    p = cls_scores_aug[aug_idx]
-                    q = cls_scores[origin_idx]
-                    m = (p + q) / 2.
-                    js = 0.5 * entropy(p, m) + 0.5 * entropy(q, m)
-                    if js < 0:
-                        js = 0
-                    consistency_box = max_iou
-                    consistency_cls = 0.5 * (conf[origin_idx] + conf_key[aug_idx]) * (1 - js)
-                    consistency_per_inst = abs(consistency_box + consistency_cls - beta)
-                    consistency_per_aug = min(consistency_per_aug, consistency_per_inst.item())
-
-                consistency += consistency_per_aug
-
-            consistency /= len(aug_results_dict)
-
-            mining_result.append((asset_path, consistency))
-            idx += 1
-
-            if idx % monitor_gap == 0:
-                percent = get_ymir_process(stage=YmirStage.TASK, p=idx / N)
-                monitor.write_monitor_logger(percent=percent)
-
-        return mining_result
-
-    def aug_predict(self, image: CV_IMAGE, bboxes: BBOX) -> Tuple[Dict[str, BBOX], Dict[str, NDArray]]:
-        """
-        for different augmentation methods: flip, cutout, rotate and resize
-        augment the image and bbox and use model to predict them.
-
-        return the predict result and augment bbox.
-        """
-        aug_dict = dict(flip=horizontal_flip,
-                        cutout=cutout,
-                        rotate=rotate,
-                        resize=resize)
-
-        aug_bboxes = dict()
-        aug_results = dict()
-        for key in aug_dict:
-            aug_img, aug_bbox = aug_dict[key](image, bboxes)
-
-            aug_result = self.predict(aug_img)
-            aug_bboxes[key] = aug_bbox
-            aug_results[key] = aug_result
-
-        return aug_bboxes, aug_results
-
-
-def get_ious(boxes1: BBOX, boxes2: BBOX) -> NDArray:
-    """
-    args:
-        boxes1: np.array, (N, 4), xyxy
-        boxes2: np.array, (M, 4), xyxy
-    return:
-        iou: np.array, (N, M)
-    """
-    area1 = (boxes1[:, 2] - boxes1[:, 0]) * (boxes1[:, 3] - boxes1[:, 1])
-    area2 = (boxes2[:, 2] - boxes2[:, 0]) * (boxes2[:, 3] - boxes2[:, 1])
-    iner_area = intersect(boxes1, boxes2)
-    area1 = area1.reshape(-1, 1).repeat(area2.shape[0], axis=1)
-    area2 = area2.reshape(1, -1).repeat(area1.shape[0], axis=0)
-    iou = iner_area / (area1 + area2 - iner_area + 1e-14)
-    return iou
-
-
-def main():
-    cfg = get_merged_config()
-    miner = MiningCald(cfg)
-    mining_result = miner.mining()
-    rw.write_mining_result(mining_result=mining_result)
-
-    return 0
-
-
-if __name__ == "__main__":
-    sys.exit(main())
diff --git a/det-yolov5-tmi/models/common.py b/det-yolov5-tmi/models/common.py
index d116aa5..35bbc69 100644
--- a/det-yolov5-tmi/models/common.py
+++ b/det-yolov5-tmi/models/common.py
@@ -5,6 +5,7 @@
 import json
 import math
+import os
 import platform
 import warnings
 from collections import OrderedDict, namedtuple
@@ -41,7 +42,17 @@ def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, k
         super().__init__()
         self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False)
         self.bn = nn.BatchNorm2d(c2)
-        self.act = nn.Hardswish() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
+
+        activation = os.environ.get('ACTIVATION', None)
+        if activation is None:
+            self.act = nn.Hardswish() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
+        else:
+            if activation.lower() == 'relu':
+                custom_act = nn.ReLU()
+            else:
+                warnings.warn(f'unknown activation {activation}, use Hardswish instead')
+                custom_act = nn.Hardswish()
+            self.act = custom_act if act is True else (act if isinstance(act, nn.Module) else nn.Identity())

     def forward(self, x):
         return self.act(self.bn(self.conv(x)))
@@ -115,7 +126,15 @@ def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ch_out, nu
         self.cv3 = nn.Conv2d(c_, c_, 1, 1, bias=False)
         self.cv4 = Conv(2 * c_, c2, 1, 1)
         self.bn = nn.BatchNorm2d(2 * c_)  # applied to cat(cv2, cv3)
-        self.act = nn.SiLU()
+        activation = os.environ.get('ACTIVATION', None)
+        if activation is None:
+            self.act = nn.SiLU()
+        else:
+            if activation.lower() == 'relu':
+                self.act = nn.ReLU()
+            else:
+                warnings.warn(f'unknown activation {activation}, use SiLU instead')
+                self.act = nn.SiLU()
         self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))

     def forward(self, x):
@@ -227,11 +246,12 @@ class GhostBottleneck(nn.Module):
     def __init__(self, c1, c2, k=3, s=1):  # ch_in, ch_out, kernel, stride
         super().__init__()
         c_ = c2 // 2
-        self.conv = nn.Sequential(GhostConv(c1, c_, 1, 1),  # pw
-                                  DWConv(c_, c_, k, s, act=False) if s == 2 else nn.Identity(),  # dw
-                                  GhostConv(c_, c2, 1, 1, act=False))  # pw-linear
-        self.shortcut = nn.Sequential(DWConv(c1, c1, k, s, act=False),
-                                      Conv(c1, c2, 1, 1, act=False)) if s == 2 else nn.Identity()
+        self.conv = nn.Sequential(
+            GhostConv(c1, c_, 1, 1),  # pw
+            DWConv(c_, c_, k, s, act=False) if s == 2 else nn.Identity(),  # dw
+            GhostConv(c_, c2, 1, 1, act=False))  # pw-linear
+        self.shortcut = nn.Sequential(DWConv(c1, c1, k, s, act=False), Conv(c1, c2, 1, 1,
+                                                                            act=False)) if s == 2 else nn.Identity()

     def forward(self, x):
         return self.conv(x) + self.shortcut(x)
@@ -260,9 +280,9 @@ def __init__(self, gain=2):
     def forward(self, x):
         b, c, h, w = x.size()  # assert C / s ** 2 == 0, 'Indivisible gain'
         s = self.gain
-        x = x.view(b, s, s, c // s ** 2, h, w)  # x(1,2,2,16,80,80)
+        x = x.view(b, s, s, c // s**2, h, w)  # x(1,2,2,16,80,80)
         x = x.permute(0, 3, 4, 1, 5, 2).contiguous()  # x(1,16,80,2,80,2)
-        return x.view(b, c // s ** 2, h * s, w * s)  # x(1,16,160,160)
+        return x.view(b, c // s**2, h * s, w * s)  # x(1,16,160,160)


 class Concat(nn.Module):
@@ -315,7 +335,7 @@ def __init__(self, weights='yolov5s.pt', device=None, dnn=False, data=None):
             stride, names = int(d['stride']), d['names']
         elif dnn:  # ONNX OpenCV DNN
             LOGGER.info(f'Loading {w} for ONNX OpenCV DNN inference...')
-            check_requirements(('opencv-python>=4.5.4',))
+            check_requirements(('opencv-python>=4.5.4', ))
             net = cv2.dnn.readNetFromONNX(w)
         elif onnx:  # ONNX Runtime
             LOGGER.info(f'Loading {w} for ONNX Runtime inference...')
@@ -326,7 +346,7 @@
             session = onnxruntime.InferenceSession(w, providers=providers)
         elif xml:  # OpenVINO
             LOGGER.info(f'Loading {w} for OpenVINO inference...')
-            check_requirements(('openvino-dev',))  # requires openvino-dev: https://pypi.org/project/openvino-dev/
+            check_requirements(('openvino-dev', ))  # requires openvino-dev: https://pypi.org/project/openvino-dev/
             import openvino.inference_engine as ie
             core = ie.IECore()
             if not Path(w).is_file():  # if not *.xml
@@ -381,9 +401,11 @@ def wrap_frozen_graph(gd, inputs, outputs):
             Interpreter, load_delegate = tf.lite.Interpreter, tf.lite.experimental.load_delegate,
             if edgetpu:  # Edge TPU https://coral.ai/software/#edgetpu-runtime
                 LOGGER.info(f'Loading {w} for TensorFlow Lite Edge TPU inference...')
-                delegate = {'Linux': 'libedgetpu.so.1',
-                            'Darwin': 'libedgetpu.1.dylib',
-                            'Windows': 'edgetpu.dll'}[platform.system()]
+                delegate = {
+                    'Linux': 'libedgetpu.so.1',
+                    'Darwin': 'libedgetpu.1.dylib',
+                    'Windows': 'edgetpu.dll'
+                }[platform.system()]
                 interpreter = Interpreter(model_path=w, experimental_delegates=[load_delegate(delegate)])
             else:  # Lite
                 LOGGER.info(f'Loading {w} for TensorFlow Lite inference...')
@@ -554,8 +576,13 @@ def forward(self, imgs, size=640, augment=False, profile=False):
         t.append(time_sync())

         # Post-process
-        y = non_max_suppression(y if self.dmb else y[0], self.conf, iou_thres=self.iou, classes=self.classes,
-                                agnostic=self.agnostic, multi_label=self.multi_label, max_det=self.max_det)  # NMS
+        y = non_max_suppression(y if self.dmb else y[0],
+                                self.conf,
+                                iou_thres=self.iou,
+                                classes=self.classes,
+                                agnostic=self.agnostic,
+                                multi_label=self.multi_label,
+                                max_det=self.max_det)  # NMS
         for i in range(n):
             scale_coords(shape1, y[i][:, :4], shape0[i])
@@ -596,8 +623,13 @@ def display(self, pprint=False, show=False, save=False, crop=False, render=False
                     label = f'{self.names[int(cls)]} {conf:.2f}'
                     if crop:
                         file = save_dir / 'crops' / self.names[int(cls)] / self.files[i] if save else None
-                        crops.append({'box': box, 'conf': conf, 'cls': cls, 'label': label,
-                                      'im': save_one_box(box, im, file=file, save=save)})
+                        crops.append({
+                            'box': box,
+                            'conf': conf,
+                            'cls': cls,
+                            'label': label,
+                            'im': save_one_box(box, im, file=file, save=save)
+                        })
                     else:  # all others
                         annotator.box_label(box, label, color=colors(cls))
         im = annotator.im
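The `models/common.py` hunks above (mirrored in `models/experimental.py` below) add an opt-in `ACTIVATION` environment variable that swaps YOLOv5's default Hardswish/SiLU activations for ReLU when the affected modules are constructed; only `relu` is recognized, and anything else warns and falls back to the default. A sketch of opting in (the config path is just an example; the variable must be set before the model is built):

```python
import os

# the variable is read inside each affected __init__, so set it before construction
os.environ['ACTIVATION'] = 'relu'

from models.yolo import Model

model = Model('models/yolov5s.yaml')  # Convs built with act=True now use nn.ReLU
```

ReLU is a common substitution when the target inference runtime or edge accelerator lacks SiLU/Hardswish support.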
diff --git a/det-yolov5-tmi/models/experimental.py b/det-yolov5-tmi/models/experimental.py
index 463e551..dbfecbf 100644
--- a/det-yolov5-tmi/models/experimental.py
+++ b/det-yolov5-tmi/models/experimental.py
@@ -2,6 +2,7 @@
 """
 Experimental modules
 """
+import os
 import math

 import numpy as np
@@ -10,6 +11,7 @@

 from models.common import Conv
 from utils.downloads import attempt_download
+import warnings


 class CrossConv(nn.Module):
@@ -59,14 +61,22 @@ def __init__(self, c1, c2, k=(1, 3), s=1, equal_ch=True):  # ch_in, ch_out, kern
         b = [c2] + [0] * n
         a = np.eye(n + 1, n, k=-1)
         a -= np.roll(a, 1, axis=1)
-        a *= np.array(k) ** 2
+        a *= np.array(k)**2
         a[0] = 1
         c_ = np.linalg.lstsq(a, b, rcond=None)[0].round()  # solve for equal weight indices, ax = b

         self.m = nn.ModuleList(
             [nn.Conv2d(c1, int(c_), k, s, k // 2, groups=math.gcd(c1, int(c_)), bias=False) for k, c_ in zip(k, c_)])
         self.bn = nn.BatchNorm2d(c2)
-        self.act = nn.SiLU()
+        activation = os.environ.get('ACTIVATION', None)
+        if activation is None:
+            self.act = nn.SiLU()
+        else:
+            if activation.lower() == 'relu':
+                self.act = nn.ReLU()
+            else:
+                warnings.warn(f'unknown activation {activation}, use SiLU instead')
+                self.act = nn.SiLU()

     def forward(self, x):
         return self.act(self.bn(torch.cat([m(x) for m in self.m], 1)))
diff --git a/det-yolov5-tmi/mypy.ini b/det-yolov5-tmi/mypy.ini
index 85e751a..6a356a3 100644
--- a/det-yolov5-tmi/mypy.ini
+++ b/det-yolov5-tmi/mypy.ini
@@ -1,8 +1,7 @@
 [mypy]
 ignore_missing_imports = True
 disallow_untyped_defs = False
-files = [mining/*.py, utils/ymir_yolov5.py, start.py, train.py]
-exclude = [utils/general.py]
+exclude = [utils/general.py, models/*.py, utils/*.py]

 [mypy-torch.*]
 ignore_errors = True
diff --git a/det-yolov5-tmi/start.py b/det-yolov5-tmi/start.py
deleted file mode 100644
index fba6632..0000000
--- a/det-yolov5-tmi/start.py
+++ /dev/null
@@ -1,133 +0,0 @@
-import logging
-import os
-import os.path as osp
-import shutil
-import subprocess
-import sys
-
-import cv2
-from easydict import EasyDict as edict
-from ymir_exc import dataset_reader as dr
-from ymir_exc import env, monitor
-from ymir_exc import result_writer as rw
-
-from utils.ymir_yolov5 import (YmirStage, YmirYolov5, convert_ymir_to_yolov5, download_weight_file, get_merged_config,
-                               get_weight_file, get_ymir_process)
-
-
-def start() -> int:
-    cfg = get_merged_config()
-
-    logging.info(f'merged config: {cfg}')
-
-    if cfg.ymir.run_training:
-        _run_training(cfg)
-    else:
-        if cfg.ymir.run_mining:
-            _run_mining(cfg)
-        if cfg.ymir.run_infer:
-            _run_infer(cfg)
-
-    return 0
-
-
-def _run_training(cfg: edict) -> None:
-    """
-    function for training task
-    1. convert dataset
-    2. training model
-    3. save model weight/hyperparameter/... to design directory
-    """
-    # 1. convert dataset
-    out_dir = cfg.ymir.output.root_dir
-    convert_ymir_to_yolov5(cfg)
-    logging.info(f'generate {out_dir}/data.yaml')
-    monitor.write_monitor_logger(percent=get_ymir_process(stage=YmirStage.PREPROCESS, p=1.0))
-
-    # 2. training model
-    epochs = cfg.param.epochs
-    batch_size = cfg.param.batch_size
-    model = cfg.param.model
-    img_size = cfg.param.img_size
-    save_period = cfg.param.save_period
-    args_options = cfg.param.args_options
-    weights = get_weight_file(cfg)
-    if not weights:
-        # download pretrained weight
-        weights = download_weight_file(model)
-
-    models_dir = cfg.ymir.output.models_dir
-    command = f'python3 train.py --epochs {epochs} ' + \
-              f'--batch-size {batch_size} --data {out_dir}/data.yaml --project /out ' + \
-              f'--cfg models/{model}.yaml --name models --weights {weights} ' + \
-              f'--img-size {img_size} ' + \
-              f'--save-period {save_period}'
-    if args_options:
-        command += f" {args_options}"
-
-    logging.info(f'start training: {command}')
-
-    subprocess.run(command.split(), check=True)
-    monitor.write_monitor_logger(percent=get_ymir_process(stage=YmirStage.TASK, p=1.0))
-
-    # 3. convert to onnx and save model weight to design directory
-    opset = cfg.param.opset
-    command = f'python3 export.py --weights {models_dir}/best.pt --opset {opset} --include onnx'
-    logging.info(f'export onnx weight: {command}')
-    subprocess.run(command.split(), check=True)
-
-    # save hyperparameter
-    shutil.copy(f'models/{model}.yaml', f'{models_dir}/{model}.yaml')
-
-    # if task done, write 100% percent log
-    monitor.write_monitor_logger(percent=1.0)
-
-
-def _run_mining(cfg: edict()) -> None:
-    # generate data.yaml for mining
-    out_dir = cfg.ymir.output.root_dir
-    convert_ymir_to_yolov5(cfg)
-    logging.info(f'generate {out_dir}/data.yaml')
-    monitor.write_monitor_logger(percent=get_ymir_process(stage=YmirStage.PREPROCESS, p=1.0))
-
-    command = 'python3 mining/mining_cald.py'
-    logging.info(f'mining: {command}')
-    subprocess.run(command.split(), check=True)
-    monitor.write_monitor_logger(percent=1.0)
-
-
-def _run_infer(cfg: edict) -> None:
-    # generate data.yaml for infer
-    out_dir = cfg.ymir.output.root_dir
-    convert_ymir_to_yolov5(cfg)
-    logging.info(f'generate {out_dir}/data.yaml')
-    monitor.write_monitor_logger(percent=get_ymir_process(stage=YmirStage.PREPROCESS, p=1.0))
-
-    N = dr.items_count(env.DatasetType.CANDIDATE)
-    infer_result = dict()
-    model = YmirYolov5(cfg)
-    idx = -1
-
-    monitor_gap = max(1, N // 100)
-    for asset_path, _ in dr.item_paths(dataset_type=env.DatasetType.CANDIDATE):
-        img = cv2.imread(asset_path)
-        result = model.infer(img)
-        infer_result[asset_path] = result
-        idx += 1
-
-        if idx % monitor_gap == 0:
-            percent = get_ymir_process(stage=YmirStage.TASK, p=idx / N)
-            monitor.write_monitor_logger(percent=percent)
-
-    rw.write_infer_result(infer_result=infer_result)
-    monitor.write_monitor_logger(percent=1.0)
-
-
-if __name__ == '__main__':
-    logging.basicConfig(stream=sys.stdout,
-                        format='%(levelname)-8s: [%(asctime)s] %(message)s',
-                        datefmt='%Y%m%d-%H:%M:%S',
-                        level=logging.INFO)
-
-    os.environ.setdefault('PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION', 'python')
-    sys.exit(start())
diff --git a/det-yolov5-tmi/train.py b/det-yolov5-tmi/train.py
index 7fcbbce..54fd2e8 100644
--- a/det-yolov5-tmi/train.py
+++ b/det-yolov5-tmi/train.py
@@ -21,6 +21,7 @@
 from copy import deepcopy
 from datetime import datetime
 from pathlib import Path
+import subprocess

 import numpy as np
 import torch
@@ -31,6 +32,7 @@
 from torch.nn.parallel import DistributedDataParallel as DDP
 from torch.optim import SGD, Adam, AdamW, lr_scheduler
 from tqdm import tqdm
+from ymir_exc import monitor

 FILE = Path(__file__).resolve()
 ROOT = FILE.parents[0]  # YOLOv5 root directory
@@ -38,6 +40,8 @@
 sys.path.append(str(ROOT))  # add ROOT to PATH
 ROOT = Path(os.path.relpath(ROOT, Path.cwd()))  # relative

+from ymir_exc.util import YmirStage, get_merged_config, get_ymir_process, write_ymir_training_result
+
 import val  # for end-of-epoch mAP
 from models.experimental import attempt_load
 from models.yolo import Model
@@ -47,17 +51,15 @@
 from utils.datasets import create_dataloader
 from utils.downloads import attempt_download
 from utils.general import (LOGGER, check_dataset, check_file, check_git_status, check_img_size, check_requirements,
-                           check_suffix, check_version, check_yaml, colorstr, get_latest_run, increment_path, init_seeds,
-                           intersect_dicts, labels_to_class_weights, labels_to_image_weights, methods, one_cycle,
-                           print_args, print_mutation, strip_optimizer)
+                           check_suffix, check_version, check_yaml, colorstr, get_latest_run, increment_path,
+                           init_seeds, intersect_dicts, labels_to_class_weights, labels_to_image_weights, methods,
+                           one_cycle, print_args, print_mutation, strip_optimizer)
 from utils.loggers import Loggers
 from utils.loggers.wandb.wandb_utils import check_wandb_resume
 from utils.loss import ComputeLoss
 from utils.metrics import fitness
 from utils.plots import plot_evolve, plot_labels
 from utils.torch_utils import EarlyStopping, ModelEMA, de_parallel, select_device, torch_distributed_zero_first
-from utils.ymir_yolov5 import write_ymir_training_result, YmirStage, get_ymir_process, get_merged_config
-from ymir_exc import monitor

 LOCAL_RANK = int(os.getenv('LOCAL_RANK', -1))  # https://pytorch.org/docs/stable/elastic/run.html
 RANK = int(os.getenv('RANK', -1))
@@ -73,7 +75,7 @@ def train(hyp,  # path/to/hyp.yaml or hyp dictionary
         Path(opt.save_dir), opt.epochs, opt.batch_size, opt.weights, opt.single_cls, opt.evolve, opt.data, opt.cfg, \
         opt.resume, opt.noval, opt.nosave, opt.workers, opt.freeze
     ymir_cfg = opt.ymir_cfg
-    opt.ymir_cfg = '' # yaml cannot dump edict, remove it here
+    opt.ymir_cfg = ''  # yaml cannot dump edict, remove it here
     log_dir = Path(ymir_cfg.ymir.output.tensorboard_dir)

     # Directories
@@ -184,7 +186,7 @@ def train(hyp,  # path/to/hyp.yaml or hyp dictionary
     if opt.cos_lr:
         lf = one_cycle(1, hyp['lrf'], epochs)  # cosine 1->hyp['lrf']
     else:
-        lf = lambda x: (1 - x / epochs) * (1.0 - hyp['lrf']) + hyp['lrf']  # linear
+        def lf(x): return (1 - x / epochs) * (1.0 - hyp['lrf']) + hyp['lrf']  # linear
     scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lf)  # plot_lr_scheduler(optimizer, scheduler, epochs)

     # EMA
@@ -206,7 +208,7 @@ def train(hyp,  # path/to/hyp.yaml or hyp dictionary
         # Epochs
         start_epoch = ckpt['epoch'] + 1
         if resume:
-            assert start_epoch > 0, f'{weights} training to {epochs} epochs is finished, nothing to resume.'
+            assert start_epoch > 0, f'{weights} training from {start_epoch} to {epochs} epochs is finished, nothing to resume.'
         if epochs < start_epoch:
             LOGGER.info(f"{weights} has been trained for {ckpt['epoch']} epochs. 
Fine-tuning for {epochs} more epochs.") epochs += ckpt['epoch'] # finetune additional epochs @@ -296,7 +298,7 @@ def train(hyp, # path/to/hyp.yaml or hyp dictionary # ymir monitor if epoch % monitor_gap == 0: - percent = get_ymir_process(stage=YmirStage.TASK, p=epoch/(epochs-start_epoch+1)) + percent = get_ymir_process(stage=YmirStage.TASK, p=(epoch - start_epoch + 1) / (epochs - start_epoch + 1)) monitor.write_monitor_logger(percent=percent) # Update image weights (optional, single-GPU only) @@ -401,7 +403,7 @@ def train(hyp, # path/to/hyp.yaml or hyp dictionary callbacks.run('on_fit_epoch_end', log_vals, epoch, best_fitness, fi) # Save model - if (not nosave) or (final_epoch and not evolve): # if save + if (not nosave) or (best_fitness == fi) or (final_epoch and not evolve): # if save ckpt = {'epoch': epoch, 'best_fitness': best_fitness, 'model': deepcopy(de_parallel(model)).half(), @@ -415,10 +417,11 @@ def train(hyp, # path/to/hyp.yaml or hyp dictionary torch.save(ckpt, last) if best_fitness == fi: torch.save(ckpt, best) - if (epoch > 0) and (opt.save_period > 0) and (epoch % opt.save_period == 0): + write_ymir_training_result(ymir_cfg, map50=best_fitness, id='best', files=[str(best)]) + if (not nosave) and (epoch > 0) and (opt.save_period > 0) and (epoch % opt.save_period == 0): torch.save(ckpt, w / f'epoch{epoch}.pt') weight_file = str(w / f'epoch{epoch}.pt') - write_ymir_training_result(ymir_cfg, map50=results[2], epoch=epoch, weight_file=weight_file) + write_ymir_training_result(ymir_cfg, map50=results[2], id=f'epoch_{epoch}', files=[weight_file]) del ckpt callbacks.run('on_model_save', last, epoch, final_epoch, best_fitness, fi) @@ -426,7 +429,7 @@ def train(hyp, # path/to/hyp.yaml or hyp dictionary if RANK == -1 and stopper(epoch=epoch, fitness=fi): break - # Stop DDP TODO: known issues shttps://github.com/ultralytics/yolov5/pull/4576 + # Stop DDP TODO: known issues https://github.com/ultralytics/yolov5/pull/4576 # stop = stopper(epoch=epoch, fitness=fi) # if RANK == 0: # dist.broadcast_object_list([stop], 0) # broadcast 'stop' to all ranks @@ -464,9 +467,20 @@ def train(hyp, # path/to/hyp.yaml or hyp dictionary callbacks.run('on_train_end', last, best, plots, epoch, results) LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}") + opset = ymir_cfg.param.opset + onnx_file: Path = best.with_suffix('.onnx') + command = f'python3 export.py --weights {best} --opset {opset} --include onnx' + LOGGER.info(f'export onnx weight: {command}') + subprocess.run(command.split(), check=True) + + if nosave: + # save best.pt and best.onnx + write_ymir_training_result(ymir_cfg, map50=best_fitness, id='best', files=[str(best), str(onnx_file)]) + else: + # set files = [] to save all files in /out/models + write_ymir_training_result(ymir_cfg, map50=best_fitness, id='best', files=[]) + torch.cuda.empty_cache() - # save the best and last weight file with other files in models_dir - write_ymir_training_result(ymir_cfg, map50=best_fitness, epoch=epochs, weight_file='') return results @@ -522,12 +536,17 @@ def main(opt, callbacks=Callbacks()): check_git_status() check_requirements(exclude=['thop']) + ymir_cfg = get_merged_config() # Resume if opt.resume and not check_wandb_resume(opt) and not opt.evolve: # resume an interrupted run - ckpt = opt.resume if isinstance(opt.resume, str) else get_latest_run() # specified or most recent path + ckpt = opt.resume if isinstance(opt.resume, str) else get_latest_run(ymir_cfg.ymir.input.root_dir) # specified or most recent path assert os.path.isfile(ckpt), 
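# The write_ymir_training_result calls added above follow a simple rule; this
# stand-alone sketch restates it (files_to_register is an illustrative name,
# not part of ymir_exc):
from typing import List


def files_to_register(nosave: bool, best_pt: str, best_onnx: str) -> List[str]:
    if nosave:
        # --nosave run: only best.pt and its onnx export exist, register exactly those
        return [best_pt, best_onnx]
    # normal run: an empty files list tells ymir to keep everything in /out/models
    return []


assert files_to_register(True, 'best.pt', 'best.onnx') == ['best.pt', 'best.onnx']
assert files_to_register(False, 'best.pt', 'best.onnx') == []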
'ERROR: --resume checkpoint does not exist' - with open(Path(ckpt).parent.parent / 'opt.yaml', errors='ignore') as f: - opt = argparse.Namespace(**yaml.safe_load(f)) # replace + + opt_file = Path(ckpt).parent / 'opt.yaml' + if opt_file.exists(): + with open(opt_file, errors='ignore') as f: + opt = argparse.Namespace(**yaml.safe_load(f)) # replace + os.makedirs(opt.save_dir, exist_ok=True) opt.cfg, opt.weights, opt.resume = '', ckpt, True # reinstate LOGGER.info(f'Resuming training from {ckpt}') else: @@ -538,9 +557,8 @@ def main(opt, callbacks=Callbacks()): if opt.project == str(ROOT / 'runs/train'): # if default project name, rename to runs/evolve opt.project = str(ROOT / 'runs/evolve') opt.save_dir = str(increment_path(Path(opt.project) / opt.name, exist_ok=opt.exist_ok)) - ymir_cfg = get_merged_config() - opt.ymir_cfg = ymir_cfg + opt.ymir_cfg = ymir_cfg # DDP mode device = select_device(opt.device, batch_size=opt.batch_size) @@ -558,9 +576,6 @@ def main(opt, callbacks=Callbacks()): # Train if not opt.evolve: train(opt.hyp, opt, device, callbacks) - if WORLD_SIZE > 1 and RANK == 0: - LOGGER.info('Destroying process group... ') - dist.destroy_process_group() # Evolve hyperparameters (optional) else: diff --git a/det-yolov5-tmi/utils/ymir_yolov5.py b/det-yolov5-tmi/utils/ymir_yolov5.py deleted file mode 100644 index 492822f..0000000 --- a/det-yolov5-tmi/utils/ymir_yolov5.py +++ /dev/null @@ -1,232 +0,0 @@ -""" -utils function for ymir and yolov5 -""" -import glob -import os.path as osp -import shutil -from enum import IntEnum -from typing import Any, List, Tuple - -import numpy as np -import torch -import yaml -from easydict import EasyDict as edict -from nptyping import NDArray, Shape, UInt8 -from ymir_exc import env -from ymir_exc import result_writer as rw - -from models.common import DetectMultiBackend -from models.experimental import attempt_download -from utils.augmentations import letterbox -from utils.general import check_img_size, non_max_suppression, scale_coords -from utils.torch_utils import select_device - - -class YmirStage(IntEnum): - PREPROCESS = 1 # convert dataset - TASK = 2 # training/mining/infer - POSTPROCESS = 3 # export model - - -BBOX = NDArray[Shape['*,4'], Any] -CV_IMAGE = NDArray[Shape['*,*,3'], UInt8] - - -def get_ymir_process(stage: YmirStage, p: float) -> float: - # const value for ymir process - PREPROCESS_PERCENT = 0.1 - TASK_PERCENT = 0.8 - POSTPROCESS_PERCENT = 0.1 - - if p < 0 or p > 1.0: - raise Exception(f'p not in [0,1], p={p}') - - if stage == YmirStage.PREPROCESS: - return PREPROCESS_PERCENT * p - elif stage == YmirStage.TASK: - return PREPROCESS_PERCENT + TASK_PERCENT * p - elif stage == YmirStage.POSTPROCESS: - return PREPROCESS_PERCENT + TASK_PERCENT + POSTPROCESS_PERCENT * p - else: - raise NotImplementedError(f'unknown stage {stage}') - - -def get_merged_config() -> edict: - """ - merge ymir_config and executor_config - """ - merged_cfg = edict() - # the hyperparameter information - merged_cfg.param = env.get_executor_config() - - # the ymir path information - merged_cfg.ymir = env.get_current_env() - return merged_cfg - - -def get_weight_file(cfg: edict) -> str: - """ - return the weight file path by priority - find weight file in cfg.param.model_params_path or cfg.param.model_params_path - """ - if cfg.ymir.run_training: - model_params_path = cfg.param.get('pretrained_model_params',[]) - else: - model_params_path = cfg.param.model_params_path - - model_dir = osp.join(cfg.ymir.input.root_dir, - cfg.ymir.input.models_dir) - model_params_path = 
[p for p in model_params_path if osp.exists(osp.join(model_dir, p))] - - # choose weight file by priority, best.pt > xxx.pt - if 'best.pt' in model_params_path: - return osp.join(model_dir, 'best.pt') - else: - for f in model_params_path: - if f.endswith('.pt'): - return osp.join(model_dir, f) - - return "" - - -def download_weight_file(model_name): - weights = attempt_download(f'{model_name}.pt') - return weights - - -class YmirYolov5(): - """ - used for mining and inference to init detector and predict. - """ - - def __init__(self, cfg: edict): - self.cfg = cfg - device = select_device(cfg.param.get('gpu_id', 'cpu')) - - self.model = self.init_detector(device) - self.device = device - self.class_names = cfg.param.class_names - self.stride = self.model.stride - self.conf_thres = float(cfg.param.conf_thres) - self.iou_thres = float(cfg.param.iou_thres) - - img_size = int(cfg.param.img_size) - imgsz = (img_size, img_size) - imgsz = check_img_size(imgsz, s=self.stride) - - self.model.warmup(imgsz=(1, 3, *imgsz), half=False) # warmup - self.img_size = imgsz - - def init_detector(self, device: torch.device) -> DetectMultiBackend: - weights = get_weight_file(self.cfg) - - data_yaml = osp.join(self.cfg.ymir.output.root_dir, 'data.yaml') - model = DetectMultiBackend(weights=weights, - device=device, - dnn=False, # not use opencv dnn for onnx inference - data=data_yaml) # dataset.yaml path - - return model - - def predict(self, img: CV_IMAGE) -> NDArray: - """ - predict single image and return bbox information - img: opencv BGR, uint8 format - """ - # preprocess: padded resize - img1 = letterbox(img, self.img_size, stride=self.stride, auto=True)[0] - - # preprocess: convert data format - img1 = img1.transpose((2, 0, 1))[::-1] # HWC to CHW, BGR to RGB - img1 = np.ascontiguousarray(img1) - img1 = torch.from_numpy(img1).to(self.device) - - img1 = img1 / 255 # 0 - 255 to 0.0 - 1.0 - img1.unsqueeze_(dim=0) # expand for batch dim - pred = self.model(img1) - - # postprocess - conf_thres = self.conf_thres - iou_thres = self.iou_thres - classes = None # not filter class_idx in results - agnostic_nms = False - max_det = 1000 - - pred = non_max_suppression(pred, conf_thres, iou_thres, classes, agnostic_nms, max_det=max_det) - - result = [] - for det in pred: - if len(det): - # Rescale boxes from img_size to img size - det[:, :4] = scale_coords(img1.shape[2:], det[:, :4], img.shape).round() - result.append(det) - - # xyxy, conf, cls - if len(result) > 0: - tensor_result = torch.cat(result, dim=0) - numpy_result = tensor_result.data.cpu().numpy() - else: - numpy_result = np.zeros(shape=(0, 6), dtype=np.float32) - - return numpy_result - - def infer(self, img: CV_IMAGE) -> List[rw.Annotation]: - anns = [] - result = self.predict(img) - - for i in range(result.shape[0]): - xmin, ymin, xmax, ymax, conf, cls = result[i, :6].tolist() - ann = rw.Annotation(class_name=self.class_names[int(cls)], score=conf, box=rw.Box( - x=int(xmin), y=int(ymin), w=int(xmax - xmin), h=int(ymax - ymin))) - - anns.append(ann) - - return anns - - -def convert_ymir_to_yolov5(cfg: edict) -> None: - """ - convert ymir format dataset to yolov5 format - generate data.yaml for training/mining/infer - """ - - data = dict(path=cfg.ymir.output.root_dir, - nc=len(cfg.param.class_names), - names=cfg.param.class_names) - for split, prefix in zip(['train', 'val', 'test'], ['training', 'val', 'candidate']): - src_file = getattr(cfg.ymir.input, f'{prefix}_index_file') - if osp.exists(src_file): - shutil.copy(src_file, 
f'{cfg.ymir.output.root_dir}/{split}.tsv') - - data[split] = f'{split}.tsv' - - with open(osp.join(cfg.ymir.output.root_dir, 'data.yaml'), 'w') as fw: - fw.write(yaml.safe_dump(data)) - - -def write_ymir_training_result(cfg: edict, - map50: float, - epoch: int, - weight_file: str) -> int: - """ - cfg: ymir config - results: (mp, mr, map50, map, loss) - maps: map@0.5:0.95 for all classes - epoch: stage - weight_file: saved weight files, empty weight_file will save all files - """ - model = cfg.param.model - # use `rw.write_training_result` to save training result - if weight_file: - rw.write_model_stage(stage_name=f"{model}_{epoch}", - files=[osp.basename(weight_file)], - mAP=float(map50)) - else: - # save other files with - files = [osp.basename(f) for f in glob.glob(osp.join(cfg.ymir.output.models_dir, '*')) - if not f.endswith('.pt')] + ['last.pt', 'best.pt'] - - rw.write_model_stage(stage_name=f"{model}_last_and_best", - files=files, - mAP=float(map50)) - return 0 diff --git a/det-yolov5-tmi/ymir/README.md b/det-yolov5-tmi/ymir/README.md new file mode 100644 index 0000000..1936a93 --- /dev/null +++ b/det-yolov5-tmi/ymir/README.md @@ -0,0 +1,43 @@ +# yolov5-ymir readme +- [yolov5 readme](./README_yolov5.md) + +``` +docker build -t ymir/ymir-executor:ymir1.1.0-cuda102-yolov5-tmi --build-arg SERVER_MODE=dev --build-arg YMIR=1.1.0 -f cuda102.dockerfile . + +docker build -t ymir/ymir-executor:ymir1.1.0-cuda111-yolov5-tmi --build-arg SERVER_MODE=dev --build-arg YMIR=1.1.0 -f cuda111.dockerfile . +``` + +## main changelog + +- add `ymir/start.py` and `ymir/ymir_yolov5.py` for train/infer/mining + +- add `ymir/ymir_yolov5.py` for useful helper functions + + - `get_merged_config()` merges the ymir path config `cfg.ymir` and the hyper-parameters `cfg.param` + + - `convert_ymir_to_yolov5()` generates the yolov5 dataset config file `data.yaml` + + - `write_ymir_training_result()` saves model weights, mAP and other result files + + - `get_weight_file()` gets the pretrained or initial weight file from the ymir system + +- modify `utils/datasets.py` for the ymir dataset format + +- modify `train.py` for training progress monitoring + +- add `mining/data_augment.py` and `mining/mining_cald.py` for mining + +- add `training/infer/mining-template.yaml`, copied to `/img-man/` when the image is built + +- add `cuda102/111.dockerfile`, remove the original `Dockerfile` + +- modify `requirements.txt` + +- other minor changes to support onnx export + +## new features + +- 2022/09/08: add the aldd active learning algorithm for the mining task.
[Active Learning for Deep Detection Neural Networks (ICCV 2019)](https://gitlab.com/haghdam/deep_active_learning) +- 2022/09/14: support change hyper-parameter `num_workers_per_gpu` +- 2022/09/16: support change activation, view [rknn](https://github.com/airockchip/rknn_model_zoo/tree/main/models/vision/object_detection/yolov5-pytorch) +- 2022/10/09: fix dist.destroy_process_group() hang diff --git a/det-yolov5-tmi/cuda102.dockerfile b/det-yolov5-tmi/ymir/docker/cuda102.dockerfile similarity index 74% rename from det-yolov5-tmi/cuda102.dockerfile rename to det-yolov5-tmi/ymir/docker/cuda102.dockerfile index 49a29d3..0014b60 100644 --- a/det-yolov5-tmi/cuda102.dockerfile +++ b/det-yolov5-tmi/ymir/docker/cuda102.dockerfile @@ -3,28 +3,27 @@ ARG CUDA="10.2" ARG CUDNN="7" FROM pytorch/pytorch:${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-runtime -ARG SERVER_MODE=prod +# support YMIR=1.0.0, 1.1.0 or 1.2.0 +ARG YMIR="1.1.0" ENV TORCH_CUDA_ARCH_LIST="6.0 6.1 7.0+PTX" ENV TORCH_NVCC_FLAGS="-Xfatbin -compress-all" ENV CMAKE_PREFIX_PATH="$(dirname $(which conda))/../" ENV LANG=C.UTF-8 +ENV YMIR_VERSION=${YMIR} # Install linux package RUN apt-get update && apt-get install -y gnupg2 git libglib2.0-0 \ - libgl1-mesa-glx curl wget zip \ + libgl1-mesa-glx libsm6 libxext6 libxrender-dev curl wget zip vim \ + build-essential ninja-build \ && apt-get clean \ && rm -rf /var/lib/apt/lists/* # install ymir-exc sdk -RUN if [ "${SERVER_MODE}" = "dev" ]; then \ - pip install --force-reinstall -U "git+https://github.com/IndustryEssentials/ymir.git/@dev#egg=ymir-exc&subdirectory=docker_executor/sample_executor/ymir_exc"; \ - else \ - pip install ymir-exc; \ - fi +RUN pip install "git+https://github.com/modelai/ymir-executor-sdk.git@ymir1.0.0" # Copy file from host to docker and install requirements -ADD ./det-yolov5-tmi /app +COPY . /app RUN mkdir /img-man && mv /app/*-template.yaml /img-man/ \ && pip install -r /app/requirements.txt diff --git a/det-yolov5-tmi/cuda111.dockerfile b/det-yolov5-tmi/ymir/docker/cuda111.dockerfile similarity index 66% rename from det-yolov5-tmi/cuda111.dockerfile rename to det-yolov5-tmi/ymir/docker/cuda111.dockerfile index 0c6e5dd..84427a8 100644 --- a/det-yolov5-tmi/cuda111.dockerfile +++ b/det-yolov5-tmi/ymir/docker/cuda111.dockerfile @@ -4,30 +4,31 @@ ARG CUDNN="8" # cuda11.1 + pytorch 1.9.0 + cudnn8 not work!!! 
FROM pytorch/pytorch:${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-runtime -ARG SERVER_MODE=prod +# support YMIR=1.0.0, 1.1.0 or 1.2.0 +ARG YMIR="1.1.0" + ENV TORCH_CUDA_ARCH_LIST="6.0 6.1 7.0+PTX" ENV TORCH_NVCC_FLAGS="-Xfatbin -compress-all" ENV CMAKE_PREFIX_PATH="$(dirname $(which conda))/../" ENV LANG=C.UTF-8 +ENV YMIR_VERSION=$YMIR # Install linux package RUN apt-get update && apt-get install -y gnupg2 git libglib2.0-0 \ - libgl1-mesa-glx curl wget zip \ + libgl1-mesa-glx libsm6 libxext6 libxrender-dev curl wget zip vim \ + build-essential ninja-build \ && apt-get clean \ && rm -rf /var/lib/apt/lists/* -# install ymir-exc sdk -RUN if [ "${SERVER_MODE}" = "dev" ]; then \ - pip install --force-reinstall -U "git+https://github.com/IndustryEssentials/ymir.git/@dev#egg=ymir-exc&subdirectory=docker_executor/sample_executor/ymir_exc"; \ - else \ - pip install ymir-exc; \ - fi +COPY ./requirements.txt /workspace/ +# install ymir-exc sdk and requirements +RUN pip install "git+https://github.com/modelai/ymir-executor-sdk.git@ymir1.0.0" \ + && pip install -r /workspace/requirements.txt # Copy file from host to docker and install requirements -ADD ./det-yolov5-tmi /app -RUN mkdir /img-man && mv /app/*-template.yaml /img-man/ \ - && pip install -r /app/requirements.txt +COPY . /app +RUN mkdir /img-man && mv /app/*-template.yaml /img-man/ # Download pretrained weight and font file RUN cd /app && bash data/scripts/download_weights.sh \ diff --git a/det-yolov5-tmi/infer-template.yaml b/det-yolov5-tmi/ymir/img-man/infer-template.yaml similarity index 83% rename from det-yolov5-tmi/infer-template.yaml rename to det-yolov5-tmi/ymir/img-man/infer-template.yaml index 89dcc96..329887a 100644 --- a/det-yolov5-tmi/infer-template.yaml +++ b/det-yolov5-tmi/ymir/img-man/infer-template.yaml @@ -10,3 +10,6 @@ img_size: 640 conf_thres: 0.25 iou_thres: 0.45 +batch_size_per_gpu: 16 +num_workers_per_gpu: 4 +pin_memory: False diff --git a/det-yolov5-tmi/mining-template.yaml b/det-yolov5-tmi/ymir/img-man/mining-template.yaml similarity index 68% rename from det-yolov5-tmi/mining-template.yaml rename to det-yolov5-tmi/ymir/img-man/mining-template.yaml index 20106dc..485c8bb 100644 --- a/det-yolov5-tmi/mining-template.yaml +++ b/det-yolov5-tmi/ymir/img-man/mining-template.yaml @@ -8,5 +8,11 @@ # class_names: [] img_size: 640 +mining_algorithm: aldd +class_distribution_scores: '' # 1.0,1.0,0.1,0.2 conf_thres: 0.25 iou_thres: 0.45 +batch_size_per_gpu: 16 +num_workers_per_gpu: 4 +pin_memory: False +shm_size: 128G diff --git a/det-yolov5-tmi/training-template.yaml b/det-yolov5-tmi/ymir/img-man/training-template.yaml similarity index 54% rename from det-yolov5-tmi/training-template.yaml rename to det-yolov5-tmi/ymir/img-man/training-template.yaml index c6d0ee4..1cc4752 100644 --- a/det-yolov5-tmi/training-template.yaml +++ b/det-yolov5-tmi/ymir/img-man/training-template.yaml @@ -7,10 +7,16 @@ # pretrained_model_params: [] # class_names: [] +shm_size: '128G' +export_format: 'ark:raw' model: 'yolov5s' -batch_size: 16 -epochs: 300 +batch_size_per_gpu: 16 +num_workers_per_gpu: 4 +epochs: 100 img_size: 640 opset: 11 args_options: '--exist-ok' +save_best_only: True # save the best weight file only save_period: 10 +sync_bn: False # work for multi-gpu only +ymir_saved_file_patterns: '' # custom saved files, support python regular expression, use , to split multiple pattern diff --git a/det-yolov5-tmi/mining/data_augment.py b/det-yolov5-tmi/ymir/mining/data_augment.py similarity index 91% rename from det-yolov5-tmi/mining/data_augment.py 
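# `ymir_saved_file_patterns` in training-template.yaml above is documented as
# comma-separated python regular expressions; a hedged sketch of how such a
# value could be parsed and applied (the helper name and the re.search-per-file
# matching rule are assumptions, not the executor's actual implementation):
import re
from typing import List


def match_saved_files(patterns: str, filenames: List[str]) -> List[str]:
    regexps = [re.compile(p.strip()) for p in patterns.split(',') if p.strip()]
    return [f for f in filenames if any(r.search(f) for r in regexps)]


print(match_saved_files(r'.*\.onnx, ^best', ['best.pt', 'last.pt', 'best.onnx']))
# -> ['best.pt', 'best.onnx']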
rename to det-yolov5-tmi/ymir/mining/data_augment.py index 47b1d50..d88a86d 100644 --- a/det-yolov5-tmi/mining/data_augment.py +++ b/det-yolov5-tmi/ymir/mining/data_augment.py @@ -8,8 +8,7 @@ import cv2 import numpy as np from nptyping import NDArray - -from utils.ymir_yolov5 import BBOX, CV_IMAGE +from ymir.ymir_yolov5 import BBOX, CV_IMAGE def intersect(boxes1: BBOX, boxes2: BBOX) -> NDArray: @@ -23,11 +22,13 @@ def intersect(boxes1: BBOX, boxes2: BBOX) -> NDArray: ''' n1 = boxes1.shape[0] n2 = boxes2.shape[0] - max_xy = np.minimum(np.expand_dims(boxes1[:, 2:], axis=1).repeat(n2, axis=1), - np.expand_dims(boxes2[:, 2:], axis=0).repeat(n1, axis=0)) + max_xy = np.minimum( + np.expand_dims(boxes1[:, 2:], axis=1).repeat(n2, axis=1), + np.expand_dims(boxes2[:, 2:], axis=0).repeat(n1, axis=0)) - min_xy = np.maximum(np.expand_dims(boxes1[:, :2], axis=1).repeat(n2, axis=1), - np.expand_dims(boxes2[:, :2], axis=0).repeat(n1, axis=0)) + min_xy = np.maximum( + np.expand_dims(boxes1[:, :2], axis=1).repeat(n2, axis=1), + np.expand_dims(boxes2[:, :2], axis=0).repeat(n1, axis=0)) inter = np.clip(max_xy - min_xy, a_min=0, a_max=None) # (n1, n2, 2) return inter[:, :, 0] * inter[:, :, 1] # (n1, n2) @@ -50,8 +51,12 @@ def horizontal_flip(image: CV_IMAGE, bbox: BBOX) \ return image, bbox -def cutout(image: CV_IMAGE, bbox: BBOX, cut_num: int = 2, fill_val: int = 0, - bbox_remove_thres: float = 0.4, bbox_min_thres: float = 0.1) -> Tuple[CV_IMAGE, BBOX]: +def cutout(image: CV_IMAGE, + bbox: BBOX, + cut_num: int = 2, + fill_val: int = 0, + bbox_remove_thres: float = 0.4, + bbox_min_thres: float = 0.1) -> Tuple[CV_IMAGE, BBOX]: ''' Cutout augmentation image: A PIL image diff --git a/det-yolov5-tmi/ymir/mining/util.py b/det-yolov5-tmi/ymir/mining/util.py new file mode 100644 index 0000000..0e9e3f5 --- /dev/null +++ b/det-yolov5-tmi/ymir/mining/util.py @@ -0,0 +1,149 @@ +"""run.py: +img --(model)--> pred --(augmentation)--> (aug1_pred, aug2_pred, ..., augN_pred) +img --(augmentation)--> aug1_img --(model)--> pred1 +img --(augmentation)--> aug2_img --(model)--> pred2 +... +img --(augmentation)--> augN_img --(model)--> predN + +dataload(img) --(model)--> pred +dataload(img, pred) --(augmentation1)--> (aug1_img, aug1_pred) --(model)--> pred1 + +1. split dataset with DDP sampler +2. use DDP model to infer sampled dataloader +3. 
gather infer result + +""" +import os +from typing import Any, List + +import cv2 +import numpy as np +import torch.utils.data as td +from nptyping import NDArray +from scipy.stats import entropy +from torch.utils.data._utils.collate import default_collate +from utils.augmentations import letterbox +from ymir.mining.data_augment import cutout, horizontal_flip, intersect, resize, rotate +from ymir.ymir_yolov5 import BBOX + +LOCAL_RANK = int(os.getenv('LOCAL_RANK', -1)) # https://pytorch.org/docs/stable/elastic/run.html +RANK = int(os.getenv('RANK', -1)) +WORLD_SIZE = int(os.getenv('WORLD_SIZE', 1)) + + +def get_ious(boxes1: BBOX, boxes2: BBOX) -> NDArray: + """ + args: + boxes1: np.array, (N, 4), xyxy + boxes2: np.array, (M, 4), xyxy + return: + iou: np.array, (N, M) + """ + area1 = (boxes1[:, 2] - boxes1[:, 0]) * (boxes1[:, 3] - boxes1[:, 1]) + area2 = (boxes2[:, 2] - boxes2[:, 0]) * (boxes2[:, 3] - boxes2[:, 1]) + iner_area = intersect(boxes1, boxes2) + area1 = area1.reshape(-1, 1).repeat(area2.shape[0], axis=1) + area2 = area2.reshape(1, -1).repeat(area1.shape[0], axis=0) + iou = iner_area / (area1 + area2 - iner_area + 1e-14) + return iou + + +def preprocess(img, img_size, stride): + img1 = letterbox(img, img_size, stride=stride, auto=False)[0] + + # preprocess: convert data format + img1 = img1.transpose((2, 0, 1))[::-1] # HWC to CHW, BGR to RGB + img1 = np.ascontiguousarray(img1) + # img1 = torch.from_numpy(img1).to(self.device) + + img1 = img1 / 255 # 0 - 255 to 0.0 - 1.0 + return img1 + + +def load_image_file(img_file: str, img_size, stride): + img = cv2.imread(img_file) + img1 = letterbox(img, img_size, stride=stride, auto=False)[0] + + # preprocess: convert data format + img1 = img1.transpose((2, 0, 1))[::-1] # HWC to CHW, BGR to RGB + img1 = np.ascontiguousarray(img1) + # img1 = torch.from_numpy(img1).to(self.device) + + img1 = img1 / 255 # 0 - 255 to 0.0 - 1.0 + # img1.unsqueeze_(dim=0) # expand for batch dim + return dict(image=img1, origin_shape=img.shape[0:2], image_file=img_file) + # return img1 + + +def load_image_file_with_ann(image_info: dict, img_size, stride): + img_file = image_info['image_file'] + # xyxy(int) conf(float) class_index(int) + bboxes = image_info['results'][:, :4].astype(np.int32) + img = cv2.imread(img_file) + aug_dict = dict(flip=horizontal_flip, cutout=cutout, rotate=rotate, resize=resize) + + data = dict(image_file=img_file, origin_shape=img.shape[0:2]) + for key in aug_dict: + aug_img, aug_bbox = aug_dict[key](img, bboxes) + preprocess_aug_img = preprocess(aug_img, img_size, stride) + data[f'image_{key}'] = preprocess_aug_img + data[f'bboxes_{key}'] = aug_bbox + data[f'origin_shape_{key}'] = aug_img.shape[0:2] + + data.update(image_info) + return data + + +def collate_fn_with_fake_ann(batch): + new_batch = dict() + for key in ['flip', 'cutout', 'rotate', 'resize']: + new_batch[f'bboxes_{key}_list'] = [data[f'bboxes_{key}'] for data in batch] + + new_batch[f'image_{key}'] = default_collate([data[f'image_{key}'] for data in batch]) + + new_batch[f'origin_shape_{key}'] = default_collate([data[f'origin_shape_{key}'] for data in batch]) + + new_batch['results_list'] = [data['results'] for data in batch] + new_batch['image_file'] = [data['image_file'] for data in batch] + + return new_batch + + +def update_consistency(consistency, consistency_per_aug, beta, pred_bboxes_key, pred_conf_key, aug_bboxes_key, + aug_conf): + cls_scores_aug = 1 - pred_conf_key + cls_scores = 1 - aug_conf + + consistency_per_aug = 2.0 + ious = get_ious(pred_bboxes_key, 
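# Worked example for intersect/get_ious above, written with numpy broadcasting;
# the second candidate box covers a quarter of the reference box:
import numpy as np


def pairwise_iou(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    # same quantity get_ious computes for (N,4) and (M,4) xyxy boxes
    tl = np.maximum(a[:, None, :2], b[None, :, :2])
    br = np.minimum(a[:, None, 2:], b[None, :, 2:])
    wh = np.clip(br - tl, 0, None)
    inter = wh[..., 0] * wh[..., 1]
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / (area_a[:, None] + area_b[None, :] - inter + 1e-14)


boxes1 = np.array([[0., 0., 10., 10.]])
boxes2 = np.array([[0., 0., 10., 10.], [5., 5., 10., 10.]])
print(pairwise_iou(boxes1, boxes2))  # ~[[1.0, 0.25]]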
aug_bboxes_key) + aug_idxs = np.argmax(ious, axis=0) + for origin_idx, aug_idx in enumerate(aug_idxs): + max_iou = ious[aug_idx, origin_idx] + if max_iou == 0: + consistency_per_aug = min(consistency_per_aug, beta) + p = cls_scores_aug[aug_idx] + q = cls_scores[origin_idx] + m = (p + q) / 2. + js = 0.5 * entropy([p, 1 - p], [m, 1 - m]) + 0.5 * entropy([q, 1 - q], [m, 1 - m]) + if js < 0: + js = 0 + consistency_box = max_iou + consistency_cls = 0.5 * (aug_conf[origin_idx] + pred_conf_key[aug_idx]) * (1 - js) + consistency_per_inst = abs(consistency_box + consistency_cls - beta) + consistency_per_aug = min(consistency_per_aug, consistency_per_inst.item()) + + consistency += consistency_per_aug + return consistency + + +class YmirDataset(td.Dataset): + def __init__(self, images: List[Any], load_fn=None): + super().__init__() + self.images = images + self.load_fn = load_fn + + def __getitem__(self, index): + return self.load_fn(self.images[index]) + + def __len__(self): + return len(self.images) diff --git a/det-yolov5-tmi/ymir/mining/ymir_infer.py b/det-yolov5-tmi/ymir/mining/ymir_infer.py new file mode 100644 index 0000000..bd1c237 --- /dev/null +++ b/det-yolov5-tmi/ymir/mining/ymir_infer.py @@ -0,0 +1,133 @@ +"""use fake DDP to infer +1. split data with `images_rank = images[RANK::WORLD_SIZE]` +2. save splited result with `torch.save(results, f'results_{RANK}.pt')` +3. merge result +""" +import os +import sys +import warnings +from functools import partial + +import torch +import torch.distributed as dist +import torch.utils.data as td +from easydict import EasyDict as edict +from tqdm import tqdm +from utils.general import scale_coords +from ymir.mining.util import YmirDataset, load_image_file +from ymir.ymir_yolov5 import YmirYolov5 +from ymir_exc import result_writer as rw +from ymir_exc.util import YmirStage, get_merged_config + +LOCAL_RANK = int(os.getenv('LOCAL_RANK', -1)) # https://pytorch.org/docs/stable/elastic/run.html +RANK = int(os.getenv('RANK', -1)) +WORLD_SIZE = int(os.getenv('WORLD_SIZE', 1)) + + +def run(ymir_cfg: edict, ymir_yolov5: YmirYolov5): + # eg: gpu_id = 1,3,5,7 for LOCAL_RANK = 2, will use gpu 5. 
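# The "fake DDP" pattern shared by ymir_infer.py here and the ymir_mining_*
# scripts below, reduced to its essentials: stride-slice the image list by
# rank, write one partial result file per rank, let rank 0 merge afterwards.
import os

RANK = int(os.getenv('RANK', -1))
WORLD_SIZE = int(os.getenv('WORLD_SIZE', 1))

images = [f'img_{i}.jpg' for i in range(10)]
images_rank = images[RANK::WORLD_SIZE] if RANK != -1 else images
# RANK=0, WORLD_SIZE=4 -> img_0, img_4, img_8; RANK=1 -> img_1, img_5, img_9 ...
partial_scores = {name: 0.0 for name in images_rank}  # per-image results go here
# each rank: torch.save(partial_scores, f'/out/mining_results_{max(0, RANK)}.pt')
# rank 0:    torch.load() all WORLD_SIZE files, then rw.write_mining_result(...)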
+ gpu = max(0, LOCAL_RANK) + device = torch.device('cuda', gpu) + ymir_yolov5.to(device) + + load_fn = partial(load_image_file, img_size=ymir_yolov5.img_size, stride=ymir_yolov5.stride) + batch_size_per_gpu = ymir_yolov5.batch_size_per_gpu + gpu_count = ymir_yolov5.gpu_count + cpu_count: int = os.cpu_count() or 1 + num_workers_per_gpu = min([ + cpu_count // max(gpu_count, 1), batch_size_per_gpu if batch_size_per_gpu > 1 else 0, + ymir_yolov5.num_workers_per_gpu + ]) + + with open(ymir_cfg.ymir.input.candidate_index_file, 'r') as f: + images = [line.strip() for line in f.readlines()] + + max_barrier_times = len(images) // max(1, WORLD_SIZE) // batch_size_per_gpu + # origin dataset + if RANK != -1: + images_rank = images[RANK::WORLD_SIZE] + else: + images_rank = images + origin_dataset = YmirDataset(images_rank, load_fn=load_fn) + origin_dataset_loader = td.DataLoader(origin_dataset, + batch_size=batch_size_per_gpu, + shuffle=False, + sampler=None, + num_workers=num_workers_per_gpu, + pin_memory=ymir_yolov5.pin_memory, + drop_last=False) + + results = [] + dataset_size = len(images_rank) + monitor_gap = max(1, dataset_size // 1000 // batch_size_per_gpu) + pbar = tqdm(origin_dataset_loader) if RANK == 0 else origin_dataset_loader + for idx, batch in enumerate(pbar): + # batch-level sync, avoid 30min time-out error + if LOCAL_RANK != -1 and idx < max_barrier_times: + dist.barrier() + + with torch.no_grad(): + pred = ymir_yolov5.forward(batch['image'].float().to(device), nms=True) + + if idx % monitor_gap == 0: + ymir_yolov5.write_monitor_logger(stage=YmirStage.TASK, p=idx * batch_size_per_gpu / dataset_size) + + preprocess_image_shape = batch['image'].shape[2:] + for inner_idx, det in enumerate(pred): # per image; do not shadow the batch index `idx` above + result_per_image = [] + image_file = batch['image_file'][inner_idx] + if len(det): + origin_image_shape = (batch['origin_shape'][0][inner_idx], batch['origin_shape'][1][inner_idx]) + # Rescale boxes from network input size to original image size + det[:, :4] = scale_coords(preprocess_image_shape, det[:, :4], origin_image_shape).round() + result_per_image.append(det) + results.append(dict(image_file=image_file, result=result_per_image)) + + torch.save(results, f'/out/infer_results_{max(0,RANK)}.pt') + + +def main() -> int: + ymir_cfg = get_merged_config() + ymir_yolov5 = YmirYolov5(ymir_cfg, task='infer') + + if LOCAL_RANK != -1: + assert torch.cuda.device_count() > LOCAL_RANK, 'insufficient CUDA devices for DDP command' + torch.cuda.set_device(LOCAL_RANK) + dist.init_process_group(backend="nccl" if dist.is_nccl_available() else "gloo") + + run(ymir_cfg, ymir_yolov5) + + # wait for all processes to save their infer results (skip in non-DDP mode, where no process group exists) + if LOCAL_RANK != -1: + dist.barrier() + + if RANK in [0, -1]: + results = [] + for rank in range(WORLD_SIZE): + results.append(torch.load(f'/out/infer_results_{rank}.pt')) + + ymir_infer_result = dict() + for result in results: + for img_data in result: + img_file = img_data['image_file'] + anns = [] + for each_det in img_data['result']: + each_det_np = each_det.data.cpu().numpy() + for i in range(each_det_np.shape[0]): + xmin, ymin, xmax, ymax, conf, cls = each_det_np[i, :6].tolist() + if conf < ymir_yolov5.conf_thres: + continue + if int(cls) >= len(ymir_yolov5.class_names): + warnings.warn(f'class index {int(cls)} out of range for {ymir_yolov5.class_names}') + continue + ann = rw.Annotation(class_name=ymir_yolov5.class_names[int(cls)], + score=conf, + box=rw.Box(x=int(xmin), y=int(ymin), w=int(xmax - xmin), + h=int(ymax - ymin))) + anns.append(ann) + ymir_infer_result[img_file] = anns +
rw.write_infer_result(infer_result=ymir_infer_result) + return 0 + + +if __name__ == '__main__': + sys.exit(main()) diff --git a/det-yolov5-tmi/ymir/mining/ymir_mining_aldd.py b/det-yolov5-tmi/ymir/mining/ymir_mining_aldd.py new file mode 100644 index 0000000..0a90e3f --- /dev/null +++ b/det-yolov5-tmi/ymir/mining/ymir_mining_aldd.py @@ -0,0 +1,210 @@ +"""use fake DDP to run ALDD mining +1. split data with `images_rank = images[RANK::WORLD_SIZE]` +2. run the detector on the origin dataset to get its raw multi-scale feature maps +3. score each image with the ALDD uncertainty (no augmented dataset is needed) +4. save the split mining result with `torch.save(results, f'/out/mining_results_{RANK}.pt')` +5. merge the mining results +""" +import os +import sys +import warnings +from functools import partial +from typing import Any, List + +import numpy as np +import torch +import torch.distributed as dist +import torch.nn.functional as F +import torch.utils.data as td +from easydict import EasyDict as edict +from tqdm import tqdm +from ymir.mining.util import YmirDataset, load_image_file +from ymir.ymir_yolov5 import YmirYolov5 +from ymir_exc import result_writer as rw +from ymir_exc.util import YmirStage, get_merged_config + +LOCAL_RANK = int(os.getenv('LOCAL_RANK', -1)) # https://pytorch.org/docs/stable/elastic/run.html +RANK = int(os.getenv('RANK', -1)) +WORLD_SIZE = int(os.getenv('WORLD_SIZE', 1)) + + +class ALDD(object): + + def __init__(self, ymir_cfg: edict): + self.avg_pool_size = 9 + self.max_pool_size = 32 + self.avg_pool_pad = (self.avg_pool_size - 1) // 2 + + self.num_classes = len(ymir_cfg.param.class_names) + if ymir_cfg.param.get('class_distribution_scores', ''): + scores = [float(x.strip()) for x in ymir_cfg.param.class_distribution_scores.split(',')] + if len(scores) < self.num_classes: + warnings.warn('extend 1.0 to class_distribution_scores') + scores.extend([1.0] * (self.num_classes - len(scores))) + self.class_distribution_scores = np.array(scores[0:self.num_classes], dtype=np.float32) + else: + self.class_distribution_scores = np.array([1.0] * self.num_classes, dtype=np.float32) + + def calc_unc_val(self, heatmap: torch.Tensor) -> torch.Tensor: + # mean of entropy + ent = F.binary_cross_entropy(heatmap, heatmap, reduction='none') + avg_ent = F.avg_pool2d(ent, + kernel_size=self.avg_pool_size, + stride=1, + padding=self.avg_pool_pad, + count_include_pad=False) # N, 1, H, W + mean_of_entropy = torch.sum(avg_ent, dim=1, keepdim=True) # N, 1, H, W + + # entropy of mean + avg_heatmap = F.avg_pool2d(heatmap, + kernel_size=self.avg_pool_size, + stride=1, + padding=self.avg_pool_pad, + count_include_pad=False) # N, C, H, W + ent_avg = F.binary_cross_entropy(avg_heatmap, avg_heatmap, reduction='none') + entropy_of_mean = torch.sum(ent_avg, dim=1, keepdim=True) # N, 1, H, W + + uncertainty = entropy_of_mean - mean_of_entropy + unc = F.max_pool2d(uncertainty, + kernel_size=self.max_pool_size, + stride=self.max_pool_size, + padding=0, + ceil_mode=False) + + # aggregating + scores = torch.mean(unc, dim=(1, 2, 3)) # (N,) + return scores + + def compute_aldd_score(self, net_output: List[torch.Tensor], net_input_shape: Any): + """ + args: + net_output: multi-scale feature maps, each of shape [bs, 3, h, w, 5 + num_classes] + net_input_shape: (height, width) of the network input + returns: + scores: np.array of shape (bs,), one aggregated uncertainty score per image + """ + if not isinstance(net_input_shape, (list, tuple)): + net_input_shape = (net_input_shape, net_input_shape) + + # CLASS_DISTRIBUTION_SCORE = np.array([1.0] * num_of_class) + scores_list = [] + + for feature_map in net_output: + feature_map.sigmoid_() + + for each_class_index in range(self.num_classes): + feature_map_list: List[torch.Tensor] = [] + + # each_output_feature_map: [bs, 3, h, w, 5
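# A tiny numeric check of the uncertainty that calc_unc_val above computes:
# entropy_of_mean - mean_of_entropy, which is ~0 where nearby activations agree
# and large where they disagree (pooling and the per-class loop are omitted;
# F.binary_cross_entropy(p, p) is the element-wise binary entropy of p).
import torch
import torch.nn.functional as F


def binary_entropy(p: torch.Tensor) -> torch.Tensor:
    return F.binary_cross_entropy(p, p, reduction='none')


flat = torch.tensor([0.5, 0.5])            # consistent mid activations
polar = torch.tensor([1e-6, 1.0 - 1e-6])   # strongly disagreeing activations
for h in (flat, polar):
    mean_of_entropy = binary_entropy(h).mean()
    entropy_of_mean = binary_entropy(h.mean().unsqueeze(0)).mean()
    print((entropy_of_mean - mean_of_entropy).item())
# flat  -> ~0.0   (low uncertainty)
# polar -> ~0.693 (high uncertainty, ln 2)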
+ num_classes] + for each_output_feature_map in net_output: + net_output_conf = each_output_feature_map[:, :, :, :, 4] + net_output_cls_mult_conf = net_output_conf * each_output_feature_map[:, :, :, :, 5 + each_class_index] + # feature_map_reshape: [bs, 3, h, w] + feature_map_reshape = F.interpolate(net_output_cls_mult_conf, + net_input_shape, + mode='bilinear', + align_corners=False) + feature_map_list.append(feature_map_reshape) + + # len(net_output) = 3 + # feature_map_concate: [bs, 9, h, w] + feature_map_concate = torch.cat(feature_map_list, 1) + # scores: [bs, 1] for each class + scores = self.calc_unc_val(feature_map_concate) + scores = scores.cpu().detach().numpy() + scores_list.append(scores) + + # total_scores: [bs, num_classes] + total_scores = np.stack(scores_list, axis=1) + total_scores = total_scores * self.class_distribution_scores + total_scores = np.sum(total_scores, axis=1) + + return total_scores + + +def run(ymir_cfg: edict, ymir_yolov5: YmirYolov5): + # eg: gpu_id = 1,3,5,7 for LOCAL_RANK = 2, will use gpu 5. + gpu = LOCAL_RANK if LOCAL_RANK >= 0 else 0 + device = torch.device('cuda', gpu) + ymir_yolov5.to(device) + + load_fn = partial(load_image_file, img_size=ymir_yolov5.img_size, stride=ymir_yolov5.stride) + batch_size_per_gpu: int = ymir_yolov5.batch_size_per_gpu + gpu_count: int = ymir_yolov5.gpu_count + cpu_count: int = os.cpu_count() or 1 + num_workers_per_gpu = min([ + cpu_count // max(gpu_count, 1), batch_size_per_gpu if batch_size_per_gpu > 1 else 0, + ymir_yolov5.num_workers_per_gpu + ]) + + with open(ymir_cfg.ymir.input.candidate_index_file, 'r') as f: + images = [line.strip() for line in f.readlines()] + + max_barrier_times = (len(images) // max(1, WORLD_SIZE)) // batch_size_per_gpu + + # origin dataset + if RANK != -1: + images_rank = images[RANK::WORLD_SIZE] + else: + images_rank = images + origin_dataset = YmirDataset(images_rank, load_fn=load_fn) + origin_dataset_loader = td.DataLoader(origin_dataset, + batch_size=batch_size_per_gpu, + shuffle=False, + sampler=None, + num_workers=num_workers_per_gpu, + pin_memory=ymir_yolov5.pin_memory, + drop_last=False) + + mining_results = dict() + dataset_size = len(images_rank) + pbar = tqdm(origin_dataset_loader) if RANK == 0 else origin_dataset_loader + miner = ALDD(ymir_cfg) + for idx, batch in enumerate(pbar): + # batch-level sync, avoid 30min time-out error + if LOCAL_RANK != -1 and idx < max_barrier_times: + dist.barrier() + + with torch.no_grad(): + featuremap_output = ymir_yolov5.model.model(batch['image'].float().to(device))[1] + unc_scores = miner.compute_aldd_score(featuremap_output, ymir_yolov5.img_size) + + for each_imgname, each_score in zip(batch["image_file"], unc_scores): + mining_results[each_imgname] = each_score + + if RANK in [-1, 0]: + ymir_yolov5.write_monitor_logger(stage=YmirStage.TASK, p=idx * batch_size_per_gpu / dataset_size) + + torch.save(mining_results, f'/out/mining_results_{max(0,RANK)}.pt') + + +def main() -> int: + ymir_cfg = get_merged_config() + # note select_device(gpu_id) will set os.environ['CUDA_VISIBLE_DEVICES'] to gpu_id + ymir_yolov5 = YmirYolov5(ymir_cfg, task='mining') + + if LOCAL_RANK != -1: + assert torch.cuda.device_count() > LOCAL_RANK, 'insufficient CUDA devices for DDP command' + torch.cuda.set_device(LOCAL_RANK) + dist.init_process_group(backend="nccl" if dist.is_nccl_available() else "gloo") + + run(ymir_cfg, ymir_yolov5) + + # wait all process to save the mining result + if LOCAL_RANK != -1: + dist.barrier() + + if RANK in [0, -1]: + results = [] + for rank in 
range(WORLD_SIZE): + results.append(torch.load(f'/out/mining_results_{rank}.pt')) + + ymir_mining_result = [] + for result in results: + for img_file, score in result.items(): + ymir_mining_result.append((img_file, score)) + rw.write_mining_result(mining_result=ymir_mining_result) + return 0 + + +if __name__ == '__main__': + sys.exit(main()) diff --git a/det-yolov5-tmi/ymir/mining/ymir_mining_cald.py b/det-yolov5-tmi/ymir/mining/ymir_mining_cald.py new file mode 100644 index 0000000..4a07d32 --- /dev/null +++ b/det-yolov5-tmi/ymir/mining/ymir_mining_cald.py @@ -0,0 +1,193 @@ +"""use fake DDP to infer +1. split data with `images_rank = images[RANK::WORLD_SIZE]` +2. infer on the origin dataset +3. infer on the augmentation dataset +4. save splited mining result with `torch.save(results, f'/out/mining_results_{RANK}.pt')` +5. merge mining result +""" +import os +import sys +from functools import partial + +import numpy as np +import torch +import torch.distributed as dist +import torch.utils.data as td +from easydict import EasyDict as edict +from tqdm import tqdm +from utils.general import scale_coords +from ymir.mining.util import (YmirDataset, collate_fn_with_fake_ann, load_image_file, load_image_file_with_ann, + update_consistency) +from ymir.ymir_yolov5 import YmirYolov5 +from ymir_exc import result_writer as rw +from ymir_exc.util import YmirStage, get_merged_config + +LOCAL_RANK = int(os.getenv('LOCAL_RANK', -1)) # https://pytorch.org/docs/stable/elastic/run.html +RANK = int(os.getenv('RANK', -1)) +WORLD_SIZE = int(os.getenv('WORLD_SIZE', 1)) + + +def run(ymir_cfg: edict, ymir_yolov5: YmirYolov5): + # eg: gpu_id = 1,3,5,7 for LOCAL_RANK = 2, will use gpu 5. + gpu = LOCAL_RANK if LOCAL_RANK >= 0 else 0 + device = torch.device('cuda', gpu) + ymir_yolov5.to(device) + + load_fn = partial(load_image_file, img_size=ymir_yolov5.img_size, stride=ymir_yolov5.stride) + batch_size_per_gpu: int = ymir_yolov5.batch_size_per_gpu + gpu_count: int = ymir_yolov5.gpu_count + cpu_count: int = os.cpu_count() or 1 + num_workers_per_gpu = min([ + cpu_count // max(gpu_count, 1), batch_size_per_gpu if batch_size_per_gpu > 1 else 0, + ymir_yolov5.num_workers_per_gpu + ]) + + with open(ymir_cfg.ymir.input.candidate_index_file, 'r') as f: + images = [line.strip() for line in f.readlines()] + + max_barrier_times = (len(images) // max(1, WORLD_SIZE)) // batch_size_per_gpu + # origin dataset + if RANK != -1: + images_rank = images[RANK::WORLD_SIZE] + else: + images_rank = images + origin_dataset = YmirDataset(images_rank, load_fn=load_fn) + origin_dataset_loader = td.DataLoader(origin_dataset, + batch_size=batch_size_per_gpu, + shuffle=False, + sampler=None, + num_workers=num_workers_per_gpu, + pin_memory=ymir_yolov5.pin_memory, + drop_last=False) + + results = [] + mining_results = dict() + beta = 1.3 + dataset_size = len(images_rank) + pbar = tqdm(origin_dataset_loader) if RANK == 0 else origin_dataset_loader + for idx, batch in enumerate(pbar): + # batch-level sync, avoid 30min time-out error + if LOCAL_RANK != -1 and idx < max_barrier_times: + dist.barrier() + + with torch.no_grad(): + pred = ymir_yolov5.forward(batch['image'].float().to(device), nms=True) + + if RANK in [-1, 0]: + ymir_yolov5.write_monitor_logger(stage=YmirStage.TASK, p=idx * batch_size_per_gpu / dataset_size) + preprocess_image_shape = batch['image'].shape[2:] + for inner_idx, det in enumerate(pred): # per image + result_per_image = [] + image_file = batch['image_file'][inner_idx] + if len(det): + origin_image_shape = 
(batch['origin_shape'][0][inner_idx], batch['origin_shape'][1][inner_idx]) + # Rescale boxes from network input size to original image size + det[:, :4] = scale_coords(preprocess_image_shape, det[:, :4], origin_image_shape).round() + result_per_image.append(det) + else: + mining_results[image_file] = -beta + continue + + results_per_image = torch.cat(result_per_image, dim=0).data.cpu().numpy() + results.append(dict(image_file=image_file, origin_shape=origin_image_shape, results=results_per_image)) + + aug_load_fn = partial(load_image_file_with_ann, img_size=ymir_yolov5.img_size, stride=ymir_yolov5.stride) + aug_dataset = YmirDataset(results, load_fn=aug_load_fn) + aug_dataset_loader = td.DataLoader(aug_dataset, + batch_size=batch_size_per_gpu, + shuffle=False, + sampler=None, + collate_fn=collate_fn_with_fake_ann, + num_workers=num_workers_per_gpu, + pin_memory=ymir_yolov5.pin_memory, + drop_last=False) + + # cannot sync here!!! + dataset_size = len(results) + monitor_gap = max(1, dataset_size // 1000 // batch_size_per_gpu) + pbar = tqdm(aug_dataset_loader) if RANK == 0 else aug_dataset_loader + for idx, batch in enumerate(pbar): + if idx % monitor_gap == 0 and RANK in [-1, 0]: + ymir_yolov5.write_monitor_logger(stage=YmirStage.TASK, p=idx * batch_size_per_gpu / dataset_size) + + batch_consistency = [0.0 for _ in range(len(batch['image_file']))] + aug_keys = ['flip', 'cutout', 'rotate', 'resize'] + + pred_result = dict() + for key in aug_keys: + with torch.no_grad(): + pred_result[key] = ymir_yolov5.forward(batch[f'image_{key}'].float().to(device), nms=True) + + for inner_idx in range(len(batch['image_file'])): + for key in aug_keys: + preprocess_image_shape = batch[f'image_{key}'].shape[2:] + result_per_image = [] + det = pred_result[key][inner_idx] + if len(det) == 0: + # no detection on the '{key}'-augmented image + batch_consistency[inner_idx] += beta + continue + + # prediction result from origin image + fake_ann = batch['results_list'][inner_idx] + # bboxes = fake_ann[:, :4].data.cpu().numpy().astype(np.int32) + conf = fake_ann[:, 4] + + # augmented bboxes derived from the origin-image bboxes; aug_conf equals conf + aug_bboxes_key = batch[f'bboxes_{key}_list'][inner_idx].astype(np.int32) + + origin_image_shape = (batch[f'origin_shape_{key}'][0][inner_idx], + batch[f'origin_shape_{key}'][1][inner_idx]) + + # Rescale boxes from network input size to original image size + det[:, :4] = scale_coords(preprocess_image_shape, det[:, :4], origin_image_shape).round() + result_per_image.append(det) + + pred_bboxes_key = det[:, :4].data.cpu().numpy().astype(np.int32) + pred_conf_key = det[:, 4].data.cpu().numpy() + batch_consistency[inner_idx] = update_consistency(consistency=batch_consistency[inner_idx], + consistency_per_aug=2.0, + beta=beta, + pred_bboxes_key=pred_bboxes_key, + pred_conf_key=pred_conf_key, + aug_bboxes_key=aug_bboxes_key, + aug_conf=conf) + + for inner_idx in range(len(batch['image_file'])): + batch_consistency[inner_idx] /= len(aug_keys) + image_file = batch['image_file'][inner_idx] + mining_results[image_file] = batch_consistency[inner_idx] + + torch.save(mining_results, f'/out/mining_results_{max(0,RANK)}.pt') + + +def main() -> int: + ymir_cfg = get_merged_config() + ymir_yolov5 = YmirYolov5(ymir_cfg, task='mining') + + if LOCAL_RANK != -1: + assert torch.cuda.device_count() > LOCAL_RANK, 'insufficient CUDA devices for DDP command' + torch.cuda.set_device(LOCAL_RANK) + dist.init_process_group(backend="nccl" if dist.is_nccl_available() else "gloo") + + run(ymir_cfg, ymir_yolov5) + + # wait for all processes to save their mining results + if
LOCAL_RANK != -1: + dist.barrier() + + if RANK in [0, -1]: + results = [] + for rank in range(WORLD_SIZE): + results.append(torch.load(f'/out/mining_results_{rank}.pt')) + + ymir_mining_result = [] + for result in results: + for img_file, score in result.items(): + ymir_mining_result.append((img_file, score)) + rw.write_mining_result(mining_result=ymir_mining_result) + return 0 + + +if __name__ == '__main__': + sys.exit(main()) diff --git a/det-yolov5-tmi/ymir/mining/ymir_mining_entropy.py b/det-yolov5-tmi/ymir/mining/ymir_mining_entropy.py new file mode 100644 index 0000000..86136e1 --- /dev/null +++ b/det-yolov5-tmi/ymir/mining/ymir_mining_entropy.py @@ -0,0 +1,115 @@ +"""use fake DDP to run entropy mining +1. split data with `images_rank = images[RANK::WORLD_SIZE]` +2. infer on the origin dataset with nms=False to keep the raw confidences +3. score each image with the entropy of its predicted confidences +4. save the split mining result with `torch.save(results, f'/out/mining_results_{RANK}.pt')` +5. merge the mining results +""" +import os +import sys +from functools import partial + +import numpy as np +import torch +import torch.distributed as dist +import torch.utils.data as td +from easydict import EasyDict as edict +from tqdm import tqdm +from ymir.mining.util import YmirDataset, load_image_file +from ymir.ymir_yolov5 import YmirYolov5 +from ymir_exc import result_writer as rw +from ymir_exc.util import YmirStage, get_merged_config + +LOCAL_RANK = int(os.getenv('LOCAL_RANK', -1)) # https://pytorch.org/docs/stable/elastic/run.html +RANK = int(os.getenv('RANK', -1)) +WORLD_SIZE = int(os.getenv('WORLD_SIZE', 1)) + + +def run(ymir_cfg: edict, ymir_yolov5: YmirYolov5): + # eg: gpu_id = 1,3,5,7 for LOCAL_RANK = 2, will use gpu 5. + gpu = LOCAL_RANK if LOCAL_RANK >= 0 else 0 + device = torch.device('cuda', gpu) + ymir_yolov5.to(device) + + load_fn = partial(load_image_file, img_size=ymir_yolov5.img_size, stride=ymir_yolov5.stride) + batch_size_per_gpu: int = ymir_yolov5.batch_size_per_gpu + gpu_count: int = ymir_yolov5.gpu_count + cpu_count: int = os.cpu_count() or 1 + num_workers_per_gpu = min([ + cpu_count // max(gpu_count, 1), batch_size_per_gpu if batch_size_per_gpu > 1 else 0, + ymir_yolov5.num_workers_per_gpu + ]) + + with open(ymir_cfg.ymir.input.candidate_index_file, 'r') as f: + images = [line.strip() for line in f.readlines()] + + max_barrier_times = (len(images) // max(1, WORLD_SIZE)) // batch_size_per_gpu + # origin dataset + if RANK != -1: + images_rank = images[RANK::WORLD_SIZE] + else: + images_rank = images + origin_dataset = YmirDataset(images_rank, load_fn=load_fn) + origin_dataset_loader = td.DataLoader(origin_dataset, + batch_size=batch_size_per_gpu, + shuffle=False, + sampler=None, + num_workers=num_workers_per_gpu, + pin_memory=ymir_yolov5.pin_memory, + drop_last=False) + + mining_results = dict() + dataset_size = len(images_rank) + pbar = tqdm(origin_dataset_loader) if RANK == 0 else origin_dataset_loader + for idx, batch in enumerate(pbar): + # batch-level sync, avoid 30min time-out error + if LOCAL_RANK != -1 and idx < max_barrier_times: + dist.barrier() + + with torch.no_grad(): + pred = ymir_yolov5.forward(batch['image'].float().to(device), nms=False) + + if RANK in [-1, 0]: + ymir_yolov5.write_monitor_logger(stage=YmirStage.TASK, p=idx * batch_size_per_gpu / dataset_size) + for inner_idx, det in enumerate(pred): # per image + image_file = batch['image_file'][inner_idx] + if len(det): + conf = det[:, 4].data.cpu().numpy() + mining_results[image_file] = -np.sum(conf * np.log2(conf)) + else: + mining_results[image_file] = -10 + continue + +
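# Numeric check of the entropy score above, -sum(conf * log2(conf)): confident
# detections contribute almost nothing, 0.5-confidence ones contribute 0.5 bit
# each, so uncertain images rise to the top of the mining list (images with no
# detection at all are pinned to -10 above, sorting them last).
import numpy as np

for conf in (np.array([0.99]), np.array([0.5]), np.array([0.5, 0.5])):
    print(conf, -np.sum(conf * np.log2(conf)))
# [0.99]      -> ~0.014
# [0.5]       -> 0.5
# [0.5, 0.5]  -> 1.0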
torch.save(mining_results, f'/out/mining_results_{max(0,RANK)}.pt') + + +def main() -> int: + ymir_cfg = get_merged_config() + ymir_yolov5 = YmirYolov5(ymir_cfg, task='mining') + + if LOCAL_RANK != -1: + assert torch.cuda.device_count() > LOCAL_RANK, 'insufficient CUDA devices for DDP command' + torch.cuda.set_device(LOCAL_RANK) + dist.init_process_group(backend="nccl" if dist.is_nccl_available() else "gloo") + + run(ymir_cfg, ymir_yolov5) + + # wait all process to save the mining result + if WORLD_SIZE > 1: + dist.barrier() + + if RANK in [0, -1]: + results = [] + for rank in range(WORLD_SIZE): + results.append(torch.load(f'/out/mining_results_{rank}.pt')) + + ymir_mining_result = [] + for result in results: + for img_file, score in result.items(): + ymir_mining_result.append((img_file, score)) + rw.write_mining_result(mining_result=ymir_mining_result) + return 0 + + +if __name__ == '__main__': + sys.exit(main()) diff --git a/det-yolov5-tmi/ymir/mining/ymir_mining_random.py b/det-yolov5-tmi/ymir/mining/ymir_mining_random.py new file mode 100644 index 0000000..eeb08cf --- /dev/null +++ b/det-yolov5-tmi/ymir/mining/ymir_mining_random.py @@ -0,0 +1,78 @@ +"""use fake DDP to infer +1. split data with `images_rank = images[RANK::WORLD_SIZE]` +2. infer on the origin dataset +3. infer on the augmentation dataset +4. save splited mining result with `torch.save(results, f'/out/mining_results_{RANK}.pt')` +5. merge mining result +""" +import os +import random +import sys + +import torch +import torch.distributed as dist +from easydict import EasyDict as edict +from tqdm import tqdm +from ymir.ymir_yolov5 import YmirYolov5 +from ymir_exc import result_writer as rw +from ymir_exc.util import YmirStage, get_merged_config + +LOCAL_RANK = int(os.getenv('LOCAL_RANK', -1)) # https://pytorch.org/docs/stable/elastic/run.html +RANK = int(os.getenv('RANK', -1)) +WORLD_SIZE = int(os.getenv('WORLD_SIZE', 1)) + + +def run(ymir_cfg: edict, ymir_yolov5: YmirYolov5): + # eg: gpu_id = 1,3,5,7 for LOCAL_RANK = 2, will use gpu 5. 
+ gpu = LOCAL_RANK if LOCAL_RANK >= 0 else 0 + device = torch.device('cuda', gpu) + ymir_yolov5.to(device) + + with open(ymir_cfg.ymir.input.candidate_index_file, 'r') as f: + images = [line.strip() for line in f.readlines()] + + if RANK != -1: + images_rank = images[RANK::WORLD_SIZE] + else: + images_rank = images + mining_results = dict() + dataset_size = len(images_rank) + pbar = tqdm(images_rank) if RANK == 0 else images_rank + for idx, image in enumerate(pbar): + if RANK in [-1, 0]: + ymir_yolov5.write_monitor_logger(stage=YmirStage.TASK, p=idx / dataset_size) + mining_results[image] = random.random() + + torch.save(mining_results, f'/out/mining_results_{max(0,RANK)}.pt') + + +def main() -> int: + ymir_cfg = get_merged_config() + ymir_yolov5 = YmirYolov5(ymir_cfg, task='mining') + + if LOCAL_RANK != -1: + assert torch.cuda.device_count() > LOCAL_RANK, 'insufficient CUDA devices for DDP command' + torch.cuda.set_device(LOCAL_RANK) + dist.init_process_group(backend="nccl" if dist.is_nccl_available() else "gloo") + + run(ymir_cfg, ymir_yolov5) + + # wait all process to save the mining result + if WORLD_SIZE > 1: + dist.barrier() + + if RANK in [0, -1]: + results = [] + for rank in range(WORLD_SIZE): + results.append(torch.load(f'/out/mining_results_{rank}.pt')) + + ymir_mining_result = [] + for result in results: + for img_file, score in result.items(): + ymir_mining_result.append((img_file, score)) + rw.write_mining_result(mining_result=ymir_mining_result) + return 0 + + +if __name__ == '__main__': + sys.exit(main()) diff --git a/det-yolov5-tmi/ymir/start.py b/det-yolov5-tmi/ymir/start.py new file mode 100644 index 0000000..11eece0 --- /dev/null +++ b/det-yolov5-tmi/ymir/start.py @@ -0,0 +1,175 @@ +import logging +import os +import subprocess +import sys + +import cv2 +from easydict import EasyDict as edict +from models.experimental import attempt_download +from ymir.ymir_yolov5 import YmirYolov5, convert_ymir_to_yolov5, get_weight_file +from ymir_exc import dataset_reader as dr +from ymir_exc import env, monitor +from ymir_exc import result_writer as rw +from ymir_exc.util import YmirStage, find_free_port, get_bool, get_merged_config, get_ymir_process + + +def start(cfg: edict) -> int: + logging.info(f'merged config: {cfg}') + + if cfg.ymir.run_training: + _run_training(cfg) + else: + if cfg.ymir.run_mining and cfg.ymir.run_infer: + # multiple task, run mining first, infer later + mining_task_idx = 0 + infer_task_idx = 1 + task_num = 2 + else: + mining_task_idx = 0 + infer_task_idx = 0 + task_num = 1 + + if cfg.ymir.run_mining: + _run_mining(cfg, mining_task_idx, task_num) + if cfg.ymir.run_infer: + _run_infer(cfg, infer_task_idx, task_num) + + return 0 + + +def _run_training(cfg: edict) -> None: + """ + function for training task + 1. convert dataset + 2. training model + 3. save model weight/hyperparameter/... to design directory + """ + # 1. convert dataset + out_dir = cfg.ymir.output.root_dir + convert_ymir_to_yolov5(cfg) + logging.info(f'generate {out_dir}/data.yaml') + monitor.write_monitor_logger(percent=get_ymir_process(stage=YmirStage.PREPROCESS, p=1.0)) + + # 2. 
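# How the percent values reported below combine, based on the stage weights of
# the removed utils/ymir_yolov5.py helper (0.1 preprocess, 0.8 task,
# 0.1 postprocess); the task_idx/task_num arguments of the ymir_exc.util
# version are assumed to scale this into equal per-task slots, e.g. mining
# followed by infer in one run:
def staged_percent(stage_start: float, stage_weight: float, p: float,
                   task_idx: int = 0, task_num: int = 1) -> float:
    overall = stage_start + stage_weight * p  # e.g. TASK stage: 0.1 + 0.8 * p
    return (task_idx + overall) / task_num    # assumed per-task subdivision


print(staged_percent(0.1, 0.8, 0.5))                          # 0.5, mid-training
print(staged_percent(0.1, 0.8, 1.0, task_idx=0, task_num=2))  # 0.45, mining done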
+    # 2. train the model
+    epochs: int = int(cfg.param.epochs)
+    batch_size_per_gpu: int = int(cfg.param.batch_size_per_gpu)
+    num_workers_per_gpu: int = int(cfg.param.get('num_workers_per_gpu', 4))
+    model: str = cfg.param.model
+    img_size: int = int(cfg.param.img_size)
+    save_period: int = int(cfg.param.save_period)
+    save_best_only: bool = get_bool(cfg, key='save_best_only', default_value=True)
+    args_options: str = cfg.param.args_options
+    gpu_id: str = str(cfg.param.get('gpu_id', '0'))
+    gpu_count: int = len(gpu_id.split(',')) if gpu_id else 0
+    batch_size: int = batch_size_per_gpu * max(1, gpu_count)
+    port: int = find_free_port()
+    sync_bn: bool = get_bool(cfg, key='sync_bn', default_value=False)
+
+    weights = get_weight_file(cfg)
+    if not weights:
+        # no local weights found: download the pretrained weights
+        weights = attempt_download(f'{model}.pt')
+
+    models_dir = cfg.ymir.output.models_dir
+    project = os.path.dirname(models_dir)
+    name = os.path.basename(models_dir)
+    assert os.path.join(project, name) == models_dir
+
+    commands = ['python3']
+    device = gpu_id or 'cpu'
+    if gpu_count > 1:
+        commands.extend(f'-m torch.distributed.launch --nproc_per_node {gpu_count} --master_port {port}'.split())
+
+    commands.extend([
+        'train.py',
+        '--epochs', str(epochs),
+        '--batch-size', str(batch_size),
+        '--data', f'{out_dir}/data.yaml',
+        '--project', project,
+        '--cfg', f'models/{model}.yaml',
+        '--name', name,
+        '--weights', weights,
+        '--img-size', str(img_size),
+        '--save-period', str(save_period),
+        '--device', device,
+        '--workers', str(num_workers_per_gpu),
+    ])
+
+    if save_best_only:
+        commands.append("--nosave")
+
+    if gpu_count > 1 and sync_bn:
+        commands.append("--sync-bn")
+
+    if args_options:
+        commands.extend(args_options.split())
+
+    logging.info(f'start training: {commands}')
+
+    subprocess.run(commands, check=True)
+    monitor.write_monitor_logger(percent=get_ymir_process(stage=YmirStage.TASK, p=1.0))
+
+    # task done: write the 100% progress log
+    monitor.write_monitor_logger(percent=1.0)
+
+
+def _run_mining(cfg: edict, task_idx: int = 0, task_num: int = 1) -> None:
+    # generate data.yaml for mining
+    out_dir = cfg.ymir.output.root_dir
+    convert_ymir_to_yolov5(cfg)
+    logging.info(f'generate {out_dir}/data.yaml')
+    monitor.write_monitor_logger(
+        percent=get_ymir_process(stage=YmirStage.PREPROCESS, p=1.0, task_idx=task_idx, task_num=task_num))
+    gpu_id: str = str(cfg.param.get('gpu_id', '0'))
+    gpu_count: int = len(gpu_id.split(',')) if gpu_id else 0
+
+    mining_algorithm = cfg.param.get('mining_algorithm', 'aldd')
+    supported_mining_algorithms = ['aldd', 'cald', 'random', 'entropy']
+    if mining_algorithm not in supported_mining_algorithms:
+        raise Exception(f'unknown mining algorithm {mining_algorithm}, not in {supported_mining_algorithms}')
+
+    if gpu_count <= 1:
+        command = f'python3 ymir/mining/ymir_mining_{mining_algorithm}.py'
+    else:
+        port = find_free_port()
+        command = f'python3 -m torch.distributed.launch --nproc_per_node {gpu_count} --master_port {port} ymir/mining/ymir_mining_{mining_algorithm}.py'  # noqa
+
+    logging.info(f'mining: {command}')
+    subprocess.run(command.split(), check=True)
+    monitor.write_monitor_logger(
+        percent=get_ymir_process(stage=YmirStage.POSTPROCESS, p=1.0, task_idx=task_idx, task_num=task_num))
+
+
+def _run_infer(cfg: edict, task_idx: int = 0, task_num: int = 1) -> None:
+    # generate data.yaml for inference
+    out_dir = cfg.ymir.output.root_dir
+    convert_ymir_to_yolov5(cfg)
+    logging.info(f'generate {out_dir}/data.yaml')
+    monitor.write_monitor_logger(
+        percent=get_ymir_process(stage=YmirStage.PREPROCESS, p=1.0, task_idx=task_idx, task_num=task_num))
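+    # with gpu_id = '0,1', the command assembled below is, illustratively:
+    #   python3 -m torch.distributed.launch --nproc_per_node 2 --master_port <free port> ymir/mining/ymir_infer.py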
+
+    gpu_id: str = str(cfg.param.get('gpu_id', '0'))
+    gpu_count: int = len(gpu_id.split(',')) if gpu_id else 0
+
+    if gpu_count <= 1:
+        command = 'python3 ymir/mining/ymir_infer.py'
+    else:
+        port = find_free_port()
+        command = f'python3 -m torch.distributed.launch --nproc_per_node {gpu_count} --master_port {port} ymir/mining/ymir_infer.py'  # noqa
+
+    logging.info(f'infer: {command}')
+    subprocess.run(command.split(), check=True)
+
+    monitor.write_monitor_logger(
+        percent=get_ymir_process(stage=YmirStage.POSTPROCESS, p=1.0, task_idx=task_idx, task_num=task_num))
+
+
+if __name__ == '__main__':
+    logging.basicConfig(stream=sys.stdout,
+                        format='%(levelname)-8s: [%(asctime)s] %(message)s',
+                        datefmt='%Y%m%d-%H:%M:%S',
+                        level=logging.INFO)
+
+    cfg = get_merged_config()
+    os.environ.setdefault('PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION', 'python')
+
+    # optionally override the activation function, e.g. activation: relu
+    activation: str = cfg.param.get('activation', '')
+    if activation:
+        os.environ.setdefault('ACTIVATION', activation)
+    sys.exit(start(cfg))
diff --git a/det-yolov5-tmi/ymir/ymir_yolov5.py b/det-yolov5-tmi/ymir/ymir_yolov5.py
new file mode 100644
index 0000000..c463ded
--- /dev/null
+++ b/det-yolov5-tmi/ymir/ymir_yolov5.py
@@ -0,0 +1,187 @@
+"""
+utility functions for ymir and yolov5
+"""
+import os.path as osp
+import shutil
+from typing import Any, List, Optional
+
+import numpy as np
+import torch
+import yaml
+from easydict import EasyDict as edict
+from models.common import DetectMultiBackend
+from nptyping import NDArray, Shape, UInt8
+from utils.augmentations import letterbox
+from utils.general import check_img_size, non_max_suppression, scale_coords
+from utils.torch_utils import select_device
+from ymir_exc import monitor
+from ymir_exc import result_writer as rw
+from ymir_exc.util import YmirStage, get_bool, get_weight_files, get_ymir_process
+
+BBOX = NDArray[Shape['*,4'], Any]
+CV_IMAGE = NDArray[Shape['*,*,3'], UInt8]
+
+
+def get_weight_file(cfg: edict) -> str:
+    """
+    return the weight file path by priority
+    search cfg.param.model_params_path and cfg.param.pretrained_model_params for weight files
+    """
+    weight_files = get_weight_files(cfg, suffix=('.pt',))
+    # choose weight file by priority: best.pt > newest xxx.pt
+    for p in weight_files:
+        if p.endswith('best.pt'):
+            return p
+
+    if len(weight_files) > 0:
+        return max(weight_files, key=osp.getctime)
+
+    return ""
+
+
+class YmirYolov5(torch.nn.Module):
+    """
+    detector wrapper for mining and inference: initializes the model and runs prediction
+    """
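+    # minimal usage sketch (paths are illustrative):
+    #   cfg = get_merged_config()          # from ymir_exc.util
+    #   detector = YmirYolov5(cfg, task='infer')
+    #   annotations = detector.infer(cv2.imread('/in/assets/demo.jpg'))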
+    def __init__(self, cfg: edict, task: str = 'infer'):
+        super().__init__()
+        self.cfg = cfg
+        if cfg.ymir.run_mining and cfg.ymir.run_infer:
+            # multiple tasks: mining runs first, inference second
+            if task == 'infer':
+                self.task_idx = 1
+            elif task == 'mining':
+                self.task_idx = 0
+            else:
+                raise Exception(f'unknown task {task}')
+
+            self.task_num = 2
+        else:
+            self.task_idx = 0
+            self.task_num = 1
+
+        self.gpu_id: str = str(cfg.param.get('gpu_id', '0'))
+        device = select_device(self.gpu_id)  # will set CUDA_VISIBLE_DEVICES=self.gpu_id
+        self.gpu_count: int = len(self.gpu_id.split(',')) if self.gpu_id else 0
+        self.batch_size_per_gpu: int = int(cfg.param.get('batch_size_per_gpu', 4))
+        self.num_workers_per_gpu: int = int(cfg.param.get('num_workers_per_gpu', 4))
+        self.pin_memory: bool = get_bool(cfg, 'pin_memory', False)
+        self.batch_size: int = self.batch_size_per_gpu * max(1, self.gpu_count)
+        self.model = self.init_detector(device)
+        self.model.eval()
+        self.device = device
+        self.class_names: List[str] = cfg.param.class_names
+        self.stride = self.model.stride
+        self.conf_thres: float = float(cfg.param.conf_thres)
+        self.iou_thres: float = float(cfg.param.iou_thres)
+
+        img_size = int(cfg.param.img_size)
+        imgsz = [img_size, img_size]
+        imgsz = check_img_size(imgsz, s=self.stride)
+
+        self.model.warmup(imgsz=(1, 3, *imgsz), half=False)  # warmup
+        self.img_size: List[int] = imgsz
+
+    def extract_feats(self, x):
+        """
+        return the feature maps before sigmoid for mining
+        """
+        return self.model.model(x)[1]
+
+    def forward(self, x, nms=False):
+        pred = self.model(x)
+        if not nms:
+            return pred
+
+        pred = non_max_suppression(pred,
+                                   conf_thres=self.conf_thres,
+                                   iou_thres=self.iou_thres,
+                                   classes=None,  # do not filter by class index
+                                   agnostic=False,
+                                   max_det=100)
+        return pred
+
+    def init_detector(self, device: torch.device) -> DetectMultiBackend:
+        weights = get_weight_file(self.cfg)
+
+        if not weights:
+            raise Exception("no weights file specified!")
+
+        data_yaml = osp.join(self.cfg.ymir.output.root_dir, 'data.yaml')
+        model = DetectMultiBackend(
+            weights=weights,
+            device=device,
+            dnn=False,  # do not use OpenCV DNN for ONNX inference
+            data=data_yaml)  # dataset.yaml path
+
+        return model
+
+    def predict(self, img: CV_IMAGE) -> NDArray:
+        """
+        predict on a single image and return bbox information
+        img: opencv BGR, uint8 format
+        """
+        # preprocess: padded resize
+        img1 = letterbox(img, self.img_size, stride=self.stride, auto=True)[0]
+
+        # preprocess: convert data format
+        img1 = img1.transpose((2, 0, 1))[::-1]  # HWC to CHW, BGR to RGB
+        img1 = np.ascontiguousarray(img1)
+        img1 = torch.from_numpy(img1).to(self.device)
+
+        img1 = img1 / 255  # 0 - 255 to 0.0 - 1.0
+        img1.unsqueeze_(dim=0)  # expand for batch dim
+        pred = self.forward(img1, nms=True)
+
+        result = []
+        for det in pred:
+            if len(det):
+                # rescale boxes from the letterboxed size back to the original image size
+                det[:, :4] = scale_coords(img1.shape[2:], det[:, :4], img.shape).round()
+                result.append(det)
+
+        # xyxy, conf, cls
+        if len(result) > 0:
+            tensor_result = torch.cat(result, dim=0)
+            numpy_result = tensor_result.data.cpu().numpy()
+        else:
+            numpy_result = np.zeros(shape=(0, 6), dtype=np.float32)
+
+        return numpy_result
+
+    def infer(self, img: CV_IMAGE) -> List[rw.Annotation]:
+        anns = []
+        result = self.predict(img)
+
+        for i in range(result.shape[0]):
+            xmin, ymin, xmax, ymax, conf, cls = result[i, :6].tolist()
+            ann = rw.Annotation(class_name=self.class_names[int(cls)],
+                                score=conf,
+                                box=rw.Box(x=int(xmin), y=int(ymin), w=int(xmax - xmin), h=int(ymax - ymin)))
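+            # rw.Box takes the top-left corner plus width/height, so the
+            # xyxy prediction above is converted to xywh here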
+            anns.append(ann)
+
+        return anns
+
+    def write_monitor_logger(self, stage: YmirStage, p: float):
+        monitor.write_monitor_logger(
+            percent=get_ymir_process(stage=stage, p=p, task_idx=self.task_idx, task_num=self.task_num))
+
+
+def convert_ymir_to_yolov5(cfg: edict, out_dir: Optional[str] = None):
+    """
+    convert a ymir format dataset to yolov5 format
+    generate data.yaml for training/mining/inference
+    """
+
+    out_dir = out_dir or cfg.ymir.output.root_dir
+    data = dict(path=out_dir, nc=len(cfg.param.class_names), names=cfg.param.class_names)
+    for split, prefix in zip(['train', 'val', 'test'], ['training', 'val', 'candidate']):
+        src_file = getattr(cfg.ymir.input, f'{prefix}_index_file')
+        if osp.exists(src_file):
+            shutil.copy(src_file, f'{out_dir}/{split}.tsv')
+
+        data[split] = f'{split}.tsv'
+
+    with open(osp.join(out_dir, 'data.yaml'), 'w') as fw:
+        fw.write(yaml.safe_dump(data))
diff --git a/docs/mining-images-overview.md b/docs/mining-images-overview.md
new file mode 100644
index 0000000..cec5f86
--- /dev/null
+++ b/docs/mining-images-overview.md
@@ -0,0 +1,21 @@
+# ymir mining images overview
+
+| docker image | random | cald | aldd | entropy |
+| - | - | - | - | - |
+| yolov5 | ✔️ | ✔️ | ✔️ | ✔️ |
+| mmdetection | ✔️ | ✔️ | ✔️ | ❌ |
+| yolov4 | ❌ | ✔️ | ✔️ | ❌ |
+| yolov7 | ❌ | ❌ | ✔️ | ❌ |
+| nanodet | ❌ | ❌ | ✔️ | ❌ |
+| vidt | ❌ | ✔️ | ❌ | ❌ |
+| detectron2 | ❌ | ✔️ | ❌ | ❌ |
+
+see [ALBench: Active Learning Benchmark](https://github.com/modelai/ALBench) for details
+
+## references
+
+- entropy: `Multi-class active learning for image classification. CVPR 2009`
+
+- cald: `Consistency-based Active Learning for Object Detection. CVPR 2022 workshop`
+
+- aldd: `Active Learning for Deep Detection Neural Networks. ICCV 2019`
diff --git a/docs/official-docker-image.md b/docs/official-docker-image.md
new file mode 100644
index 0000000..a01a91a
--- /dev/null
+++ b/docs/official-docker-image.md
@@ -0,0 +1,61 @@
+# official docker images
+
+- [yolov4](https://github.com/modelai/ymir-executor-fork#det-yolov4-training)
+
+    ```
+    docker pull youdaoyzbx/ymir-executor:ymir1.1.0-yolov4-cu112-tmi
+
+    docker pull youdaoyzbx/ymir-executor:ymir1.1.0-yolov4-cu101-tmi
+    ```
+
+- [yolov5](https://github.com/modelai/ymir-executor-fork#det-yolov5-tmi)
+
+    - [change log](../det-yolov5-tmi/README.md)
+
+    ```
+    docker pull youdaoyzbx/ymir-executor:ymir1.1.0-yolov5-cu111-tmi
+
+    docker pull youdaoyzbx/ymir-executor:ymir1.1.0-yolov5-cu102-tmi
+    ```
+
+- [mmdetection](https://github.com/modelai/ymir-executor-fork#det-mmdetection-tmi)
+
+    - [change log](../det-mmdetection-tmi/README.md)
+
+    ```
+    docker pull youdaoyzbx/ymir-executor:ymir1.1.0-mmdet-cu111-tmi
+
+    docker pull youdaoyzbx/ymir-executor:ymir1.1.0-mmdet-cu102-tmi
+    ```
+
+- [detectron2](https://github.com/modelai/ymir-detectron2)
+
+    - [change log](https://github.com/modelai/ymir-detectron2/blob/master/README.md)
+
+    ```
+    docker pull youdaoyzbx/ymir-executor:ymir1.1.0-detectron2-cu111-tmi
+    ```
+
+- [yolov7](https://github.com/modelai/ymir-yolov7)
+
+    - [change log](https://github.com/modelai/ymir-yolov7/blob/main/ymir/README.md)
+
+    ```
+    docker pull youdaoyzbx/ymir-executor:ymir1.1.0-yolov7-cu111-tmi
+    ```
+
+- [vidt](https://github.com/modelai/ymir-vidt)
+
+    - [change log](https://github.com/modelai/ymir-vidt/tree/main/ymir)
+
+    ```
+    docker pull youdaoyzbx/ymir-executor:ymir1.1.0-vidt-cu111-tmi
+    ```
+
+- [nanodet](https://github.com/modelai/ymir-nanodet/tree/ymir-dev)
+
+    - [change log](https://github.com/modelai/ymir-nanodet/tree/ymir-dev/ymir)
+
+    ```
+    docker pull youdaoyzbx/ymir-executor:ymir1.1.0-nanodet-cu111-tmi
+    ```
diff --git a/docs/ymir-docker-develop.drawio.png b/docs/ymir-docker-develop.drawio.png
new file mode 100644
index 0000000..706a95e
Binary files /dev/null and b/docs/ymir-docker-develop.drawio.png differ
diff --git a/docs/ymir-executor-version.md b/docs/ymir-executor-version.md
new file mode 100644
index 0000000..1c1c30f
--- /dev/null
+++ b/docs/ymir-executor-version.md
@@ -0,0 +1,19 @@
+# ymir1.3.0 (2022-09-30)
+
+- supports saving model weights separately: users may run inference with epoch10.pth or with epoch20.pth
+
+- training images must specify the dataset annotation format; ymir1.1.0 uses `ark:raw` as the default
+
+- training images can query the ymir interface version of the host system, which makes image compatibility easier
+
+## helper libraries
+
+- [ymir-executor-sdk](https://github.com/modelai/ymir-executor-sdk): use the ymir1.3.0 branch
+
+- [ymir-executor-verifier](https://github.com/modelai/ymir-executor-verifier): a tool for checking docker images
+
+# ymir1.1.0
+
+- [custom ymir-executor](https://github.com/IndustryEssentials/ymir/blob/dev/dev_docs/ymir-dataset-zh-CN.md)
+
+- [ymir-executor-sdk](https://github.com/modelai/ymir-executor-sdk): use the ymir1.0.0 branch
diff --git a/live-code-executor/img-man/training-template.yaml b/live-code-executor/img-man/training-template.yaml
index 865b40b..df87016 100644
--- a/live-code-executor/img-man/training-template.yaml
+++ b/live-code-executor/img-man/training-template.yaml
@@ -6,3 +6,5 @@ gpu_id: '0'
 task_id: 'default-training-task'
 pretrained_model_params: []
 class_names: []
+export_format: 'ark:raw'
+shm_size: '128G'
diff --git a/live-code-executor/mxnet.dockerfile b/live-code-executor/mxnet.dockerfile
index e04bd4b..ed08fff 100644
--- a/live-code-executor/mxnet.dockerfile
+++ b/live-code-executor/mxnet.dockerfile
@@ -15,7 +15,8 @@ ENV PATH /opt/conda/bin:$PATH
 # install linux package, needs to fix GPG error first.
 RUN apt-key adv --keyserver keyserver.ubuntu.com --recv-keys A4B469963BF863CC && \
     apt-get update && \
-    apt-get install -y git gcc wget curl zip libglib2.0-0 libgl1-mesa-glx && \
+    apt-get install -y git gcc wget curl zip libglib2.0-0 libgl1-mesa-glx \
+    libsm6 libxext6 libxrender-dev build-essential && \
     apt-get clean && \
     rm -rf /var/lib/apt/lists/* && \
     wget "${MINICONDA_URL}" -O miniconda.sh -q && \
diff --git a/live-code-executor/torch.dockerfile b/live-code-executor/torch.dockerfile
index a71476f..4fd9a90 100644
--- a/live-code-executor/torch.dockerfile
+++ b/live-code-executor/torch.dockerfile
@@ -15,7 +15,8 @@ ENV LANG=C.UTF-8
 # install linux package
 RUN apt-get update && apt-get install -y git curl wget zip gcc \
-    libglib2.0-0 libgl1-mesa-glx \
+    libglib2.0-0 libgl1-mesa-glx libsm6 libxext6 libxrender-dev \
+    build-essential \
     && apt-get clean \
     && rm -rf /var/lib/apt/lists/*
diff --git a/live-code-executor/ymir_start.py b/live-code-executor/ymir_start.py
index d2c5415..ee81336 100644
--- a/live-code-executor/ymir_start.py
+++ b/live-code-executor/ymir_start.py
@@ -50,7 +50,13 @@ def main():
         logger.info('no python package needs to install')
 
     # step 3. run /app/start.py
-    cmd = 'python3 start.py'
+    if osp.exists('/app/start.py'):
+        cmd = 'python3 start.py'
+    elif osp.exists('/app/ymir/start.py'):
+        cmd = 'python3 ymir/start.py'
+    else:
+        raise Exception('cannot find start.py')
+
     logger.info(f'run task: {cmd}')
     subprocess.run(cmd.split(), check=True, cwd='/app')