diff --git a/week01/Lecture_1.pptx b/week01/Lecture_1.pptx index e314811..56b8ba2 100644 Binary files a/week01/Lecture_1.pptx and b/week01/Lecture_1.pptx differ diff --git a/week01/README.md b/week01/README.md index 24eabf7..0c49f73 100644 --- a/week01/README.md +++ b/week01/README.md @@ -1,21 +1,14 @@ -# Lecture 1: Introduction to the class and Nvidia Jetson Device (Nano, Nano 4G or NX) +# Lecture 1: Introduction to the class and deep learning with transformers Cloud computing. Big Data. Artificial intelligence. Deep learning frameworks and hardware. Datasets. Edge computing. Course project overview and sample. It is very important that you have all the required equipment ready before the first sync session. ## Reading: -* Nvidia Jetpack -https://developer.nvidia.com/embedded/jetpack -* Nvidia Jetson family overview -https://www.nvidia.com/en-us/autonomous-machines/embedded-systems-dev-kits-modules/ -* Jetson NX intro -https://developer.nvidia.com/embedded/jetson-xavier-nx-devkit -* Jetson Nano 2G User Guide -https://developer.nvidia.com/embedded/learn/jetson-nano-2gb-devkit-user-guide -* [Jetson Nano 4G User Guide](https://developer.nvidia.com/jetson-xavier-nx-developer-kit-user-guide) -* NGC Containers for Linux for Tegra (l4t) -https://catalog.ngc.nvidia.com/containers?filters=&orderBy=dateModifiedDESC&query=l4t +* Introduction: Hugging Face +https://huggingface.co/course/chapter0/1?fw=pt +* Transformer models +https://huggingface.co/course/chapter1/1?fw=pt * Two days to a demo (skim through) https://developer.nvidia.com/embedded/twodaystoademo * Cloud 001: Introduction to SSH diff --git a/week01/hw/CreateUbuntuVMInVMware.mp4 b/week01/hw/CreateUbuntuVMInVMware.mp4 deleted file mode 100644 index a5f445a..0000000 Binary files a/week01/hw/CreateUbuntuVMInVMware.mp4 and /dev/null differ diff --git a/week01/hw/Readme.md b/week01/hw/Readme.md index 2b6c010..af30ff7 100644 --- a/week01/hw/Readme.md +++ b/week01/hw/Readme.md @@ -1,471 +1,25 @@ -# HW 1: Installing 
JetPack and Docker +# HW 1: Setting up workspace with Hugging Face 🤗 +This is just a warm-up; things will get harder soon ;-) As a result of this homework, we want you to get comfortable with the Hugging Face 🤗 environment so that over the next week you can use it as a platform for building, training/fine-tuning, deploying, and running your models. -## 1. Nvidia JetPack SDK -JetPack is an SDK that basically contains everything needed for deep learning and AI applications in a handy package bundle containing the OS for for the Nano. Installation on the Nano requires downloading and flashing the image to a MicroSD card. - -Due to supply shortages, we are recommending the Jetson Nano Developer Kit 4GB model (will be provided by the school free of charge) or the Jetson Xavier NX Developer Kit. If you are able to find a reasonably priced NX, feel free to use that instead of the Nano as it is by far the more powerful device. The original pricing on it is $399; there's no need to pay much more than that. Also, a 2GB Nano would work, although it would be the least preferred option due to the very small memory footprint. - - -So, you will need the following: - - 1. [Jetson Nano Developer Ki: 4GB](https://shop.nvidia.com/en-us/jetson/store/?page=1&limit=9&locale=en-us) - but will be provided free of charge. - 2. [128GB Micro SD](https://www.amazon.com/gp/product/B07G3H5RBT/ref=ppx_yo_dt_b_asin_title_o00_s00?ie=UTF8&psc=1) - 3. USB MicroSD card reader - 4. [WiFi/Bluetooth card](https://www.amazon.com/dp/B085M7VPDP?psc=1&ref=ppx_yo2_dt_b_product_details) - 5. [Power adapter](https://www.amazon.com/gp/product/B08DXZ1MSY/ref=ppx_yo_dt_b_asin_title_o00_s00?ie=UTF8&psc=1) ### NOTE: ensure you set the jumper when using the power adapter. Reference [here](https://www.jetsonhacks.com/2019/04/10/jetson-nano-use-more-power/). 
Note: if you are using [the Nano 2GB](https://developer.nvidia.com/embedded/learn/get-started-jetson-nano-2gb-devkit), check out the [list of supported components](https://developer.nvidia.com/embedded/learn/jetson-nano-2gb-devkit-user-guide#id-.JetsonNano2GBDeveloperKitUserGuidevbatuu_v1.0-SupportedComponentList). You could get [this 3.5A power supply.](https://www.amazon.com/CanaKit-Raspberry-Power-Supply-USB-C/dp/B07TYQRXTK/ref=sr_1_8?dchild=1&keywords=18W+power+supply+usb+c&qid=1595554849&sr=8-8) - 6. [1TB USB3 SSD](https://www.amazon.com/Samsung-T5-Portable-SSD-MU-PA2T0B/dp/B073H552FJ/ref=sr_1_3) ### NOTE: mount to the normal USB port; the USB-C port is needed for the power supply. - 7. [USB Webcam](https://www.amazon.com/Logitech-960-000637-HD-Webcam-C310/dp/B003PAIV2Q/ref=sr_1_6) - 8. You will also typically need a USB 3.0 hub (ideally, a powered one), a mouse, a keyboard, and a monitor. - -If you are able to find a Jetson NX, the following is needed: - 1. MicroSD card (64GB minimum size) - 2. USB MicroSD card reader - 3. NVMe M.2 SSD (256GB minimum size) **NOTE: SATA M.2 SSDs will not work** - 4. Size 0 Philips head screwdriver - 5. Micro USB to USB cable - 6. USB Webcam - -### 1.1 Host (Computer) Installation - -On your Windows, Mac, or Ubuntu workstation, navigate to the [JetPack homepage](https://developer.nvidia.com/jetpack) (**NOTE that we shold be using JetPack 4.6.1 for this class as it is ATM the latest production version**) and click on "Download SD Card Image" in the `JETSON XAVIER NX DEVELOPER KIT` box. Once it downloads, follow the steps at [Getting Started with Jetson Xavier NX Developer Kit](https://developer.nvidia.com/embedded/learn/get-started-jetson-xavier-nx-devkit) to flash the SD card. - -NVIDIA provides [flashing instructions](https://developer.nvidia.com/embedded/learn/get-started-jetson-nano-devkit#write) for Windows, Linux, and Mac. 
You will need to insert the MicroSD card into the card reader and connect it to the USB port on your computer. - - -A quick video showing MicroSD card [here](Xavier_NX_Install_SSD.mp4). Note, this video is for a Jetson Xavier NX and includes the installation of a SSD. - -Once the flashing process is complete, you will insert the MicroSD card into your Jetson. **Do not power it on yet.** - - - - -### 1.2 Post-flash setup - -There are two setup options. - - 1. Use a USB keyboard/mouse and HDMI display - 2. Use a tty terminal from Mac or Linux - -With the first option, you will obviously need additional hardware. Option number two is quite easy, though, and we will walk you through the process. - -If you choose option two, you will need a machine running Linux (we recommend Ubuntu 18.04), or a Mac. If you do not have one, you can create a VM running Ubuntu (see section 1.2). - -If you are using install option one, you can connect the keyboard, mouse, and display to [complete the setup process](https://developer.nvidia.com/embedded/learn/get-started-jetson-nano-devkit#setup). Once they are connected, you can connect the power adapter. Follow the steps to complete the setup and then go to section 2. If you have issues connecting to wifi, skip that step and connect the Nano directly to your router with an ethernet cable **after** the setup is complete. - -If you are using install option two, you can connect the Nano to your Mac or Linux computer using the micro-USB cable and then connect the power adapter. - -**If you are using a VMware VM, you will be prompted to connect the USB device to your host computer or the VM; choose the VM. When in the VM, use "lsusb" in the terminal to check if the Jetson is visible.** - -### 1.3 If you chose Option 2 in section 1.2 - -You will need to use a Linux VM or a Mac to perform these steps. 
- -#### 1.3.1 Create a VM (skip this step if you are using a Mac) - -You get a free VMware subscription through Berkeley [here](https://software.berkeley.edu/vmware). Download and install VMware Workstation (for Windows) or VMware Fusion (for macOS). - -Download the Ubuntu 18.04 iso image [here](http://releases.ubuntu.com/18.04/ubuntu-18.04.3-desktop-amd64.iso). - -Create a new VM in VMware. - -Walk-through on VMware image creation is [here](CreateUbuntuVMInVMware.mp4). - -**VM Configuration**, the size of the disk should be 40GB absolutely minimum. Give the VM 2-4 cores to make sure cross-compilation does not take forever, and at least 4-8G of RAM. - - -#### 1.3.2 Mac: -Run this command from the Mac Terminal application: - -``` -ls -ls /dev/cu.* -``` - -You should see a `usbmodem` device like: - -``` -/dev/cu.usbmodem14210200096973 -``` - -You will use the `screen` command to connect to the tty device: - -``` -screen /dev/cu.usbmode* 115200 -L -``` - -#### 1.3.3 Linux: -Run this command from the Linux terminal application: - -``` -ls -ls /dev/ttyA* -``` - -You should see a `ttyACM` device like: - -``` -/dev/ttyACM0 -``` -You will use the `screen` command to connect to the tty device: - -``` -sudo apt-get update -sudo apt-get install -y screen -sudo screen /dev/ttyACM0 115200 -L -``` - -### 1.4 Both Linux and Mac: - -You will finish the [setup process](https://developer.nvidia.com/embedded/learn/get-started-jetson-nano-devkit#setup) using the tty terminal you just opened to the device. - -## 2. Configure VNC - -It is highly recommended that you connect your Nano directly to your router with an ethernet cable. - -You can have a keyboard, mouse, and monitor attached to your Jetson; but it is also extremely convenient to set up screen sharing, so you can see the Jetson desktop remotely. This is needed, for instance, when you want to show Jetson's screen over a web conference - plus it's a lot easier than switching between monitors all the time. - -1. 
Get a VNC screen sharing client. You can install [TightVNC](https://www.tightvnc.com/), [Remmina](https://remmina.org/), or another VNC client of your choice. -2. Configure your Nano for remote screen sharing. - -On your Nano, open a terminal (or ssh to your Nano from another computer). - -``` -mkdir ~/.config/autostart -``` -* Now, create/open the file ```~/.config/autostart/vino-server.desktop``` and insert the following: - -``` -[Desktop Entry] -Type=Application -Name=Vino VNC server -Exec=/usr/lib/vino/vino-server -NoDisplay=true -``` - -* Disable security by running the following from the command line: - -``` -gsettings set org.gnome.Vino prompt-enabled false -gsettings set org.gnome.Vino require-encryption false -``` - -* Enable automatic login by editing /etc/gdm3/custom.conf and add the following (the AutomaticLogin ID should be the user ID you created in the setup): - -``` -# Enabling automatic login - AutomaticLoginEnable = true - AutomaticLogin = nvidia # Ensure that you replace 'nvidia' with the ID you use to login to your Nano -``` - - -* Reboot your Nano -* Then, launch your remote sharing client, choose VNC as the protocol, type in the IP address of your jetson and port 5900. - -**NOTE:** -To find your IP address, use the following command: - -``` -nvidia@nano:~$ ip addr show | grep inet - inet 127.0.0.1/8 scope host lo - inet6 ::1/128 scope host - inet 192.168.11.103/24 brd 192.168.11.255 scope global dynamic noprefixroute eth0 - inet6 fe80::4ab0:2dff:fe05:a700/64 scope link noprefixroute -nvidia@nano:~$ -``` -The IP address in this example is on the third line: `192.168.11.103`. - -When using VNC it is strongly recommended to us a reslution less than 4k as resolutions at 4k or higher can cause additional lag. -For example, a resolution of 1600x900 typically decent performance (you may adjust as needed). 
- -Make sure your display cable is not plugged into your Nano (if it is, unplug it and reboot) and from a SSH shell enter: -``` -export DISPLAY=:0 -xhost + -sudo xrandr --fb 1600x900 -``` - -You'll need to run this after you reboot your Nano. - -### Now run a VNC viewer on your computer (not the Jetson): - -On any platform, you can download a VNC Viewer (like Real VNC) and use it: - -![vnc2](vnc2.png) -and - -![vnc1](vnc1.png) - -On Linux, you can use Remmina: - -![remmina](remmina2.png) - -* The default resolution is very small. You can change it with this command (required after every reboot): - -``` -sudo xrandr --fb 1600x900 # you can choose some other resolution if desired -``` - - - - -### Testing JetPack on the Nano -Ensure the Nano is on and running Ubuntu. Use this command to verify that everything is happy and healthy: - -``` -sudo nvpmodel -q --verbose -``` - -The output should be similar to: - -``` -NVPM VERB: Config file: /etc/nvpmodel.conf -NVPM VERB: parsing done for /etc/nvpmodel.conf -NVPM WARN: fan mode is not set! 
-NVPM VERB: Current mode: NV Power Mode: MAXN -0 -NVPM VERB: PARAM CPU_ONLINE: ARG CORE_0: PATH /sys/devices/system/cpu/cpu0/online: REAL_VAL: 1 CONF_VAL: 1 -NVPM VERB: PARAM CPU_ONLINE: ARG CORE_1: PATH /sys/devices/system/cpu/cpu1/online: REAL_VAL: 1 CONF_VAL: 1 -NVPM VERB: PARAM CPU_ONLINE: ARG CORE_2: PATH /sys/devices/system/cpu/cpu2/online: REAL_VAL: 1 CONF_VAL: 1 -NVPM VERB: PARAM CPU_ONLINE: ARG CORE_3: PATH /sys/devices/system/cpu/cpu3/online: REAL_VAL: 1 CONF_VAL: 1 -NVPM VERB: PARAM CPU_A57: ARG MIN_FREQ: PATH /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq: REAL_VAL: 102000 CONF_VAL: 0 -NVPM VERB: PARAM CPU_A57: ARG MAX_FREQ: PATH /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq: REAL_VAL: 1479000 CONF_VAL: 2147483647 -NVPM VERB: PARAM GPU_POWER_CONTROL_ENABLE: ARG GPU_PWR_CNTL_EN: PATH /sys/devices/gpu.0/power/control: REAL_VAL: auto CONF_VAL: on -NVPM VERB: PARAM GPU: ARG MIN_FREQ: PATH /sys/devices/gpu.0/devfreq/57000000.gpu/min_freq: REAL_VAL: 76800000 CONF_VAL: 0 -NVPM VERB: PARAM GPU: ARG MAX_FREQ: PATH /sys/devices/gpu.0/devfreq/57000000.gpu/max_freq: REAL_VAL: 921600000 CONF_VAL: 2147483647 -NVPM VERB: PARAM GPU_POWER_CONTROL_DISABLE: ARG GPU_PWR_CNTL_DIS: PATH /sys/devices/gpu.0/power/control: REAL_VAL: auto CONF_VAL: auto -NVPM VERB: PARAM EMC: ARG MAX_FREQ: PATH /sys/kernel/nvpmodel_emc_cap/emc_iso_cap: REAL_VAL: 0 CONF_VAL: 0 -NVPM VERB: PARAM CVNAS: ARG MAX_FREQ: PATH /sys/kernel/nvpmodel_emc_cap/nafll_cvnas: REAL_VAL: 576000000 CONF_VAL: 576000000 - -``` - -### Exploring the power modes of the Nano -The Jetson line of SoCs (including the Nano) has a number of different power modes described in some detail here: [TX2](https://www.jetsonhacks.com/2017/03/25/nvpmodel-nvidia-jetson-tx2-development-kit/) or [Xavier](https://www.jetsonhacks.com/2018/10/07/nvpmodel-nvidia-jetson-agx-xavier-developer-kit/). 
The main idea is that the lowering clock speeds on the cpu and turning off cores saves energy; and the default power mode is a low energy mode. You need to switch to a higher power mode to use all cores and maximize the clock frequency. In the upper right corner of your desktop you will see a widget that should allow you to switch between power modes. Set your power mode to MAXN; this will enable all cores and will maximize your clock frequency. This is ok when we use our Nano as a small desktop computer. If you decide to use your Nano as a robotic device and become worried about the power draw, you may want to lower this setting. - -## 3. Prepare the SSD - -### 3.1 Configure a USB-attached SSD (both Nano and Xavier) -### Note: If you have a NVMe SSD on your Xavier, skip to section 3.2 - -Run `lsblk` to find the SSD. - -The output will show all your block devices. Look under the SIZE column for the correct size device (465.8G in the example below). Note the device name (sda in the example below). 
Your output should show a / in the MOUNTPOINT column of the mmcblk0p1 line: - -``` -NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT -loop0 7:0 0 16M 1 loop -sda 8:0 0 465.8G 0 disk -mtdblock0 31:0 0 32M 0 disk -mmcblk0 179:0 0 59.5G 0 disk -├─mmcblk0p1 179:1 0 59.2G 0 part / -├─mmcblk0p2 179:2 0 64M 0 part -├─mmcblk0p3 179:3 0 64M 0 part -├─mmcblk0p4 179:4 0 448K 0 part -├─mmcblk0p5 179:5 0 448K 0 part -├─mmcblk0p6 179:6 0 63M 0 part -├─mmcblk0p7 179:7 0 512K 0 part -├─mmcblk0p8 179:8 0 256K 0 part -├─mmcblk0p9 179:9 0 256K 0 part -├─mmcblk0p10 179:10 0 100M 0 part -└─mmcblk0p11 179:11 0 18K 0 part -zram0 252:0 0 494.5M 0 disk [SWAP] -zram1 252:1 0 494.5M 0 disk [SWAP] -zram2 252:2 0 494.5M 0 disk [SWAP] -zram3 252:3 0 494.5M 0 disk [SWAP] -``` - -To setup the SSD, run the following commands: - -``` -# Wipe the SSD -sudo wipefs --all --force /dev/sda - -# Partition the SSD -sudo parted --script /dev/sda mklabel gpt mkpart primary ext4 0% 100% - -# Format the newly created partition -sudo mkfs.ext4 /dev/sda1 - -# Create the fstab entry -echo "/dev/sda1 /data ext4 defaults 0 1" | sudo tee -a /etc/fstab - -# Mount the ssd and set the permissions -mkdir /data -mount /data -chmod go+rwx /data - -# Move the Docker repo to /data -sudo systemctl stop docker -sudo mv /var/lib/docker /data/ -sudo ln -s /data/docker/ /var/lib/docker -sudo systemctl start docker - -# Verify that Docker re-started -sudo systemctl status docker - -``` - -Continue to section 3.3 to set up the swap space. - -### 3.2 Configure Operating System to run from SSD (Xavier NX with a NVMe ONLY) - -### Note: It is advised to run lsblk after each reboot to ensure that the Jetson is using the correct boot device. - -Steps: - -### Note, with 4.6, there may be times when the Jetson fails to use the attached SSD as the root file system. You can check this by running `lsblk` and confirmning the SD card is not using / as a mount point. A reboot seems to correct this. 
- - -Follow the instructions on [this page](https://www.jetsonhacks.com/2020/05/29/jetson-xavier-nx-run-from-ssd/) (watch the video carefully). - -# WARNING: This is a destructive process and will wipe your SSD. -### Note: This version is for an Xavier NX with a NVMe SSD located at /dev/nvme0n1, which is the standard device location - -Steps: - -Verify that the OS is booting from the Micro SD. - -``` -lsblk -``` - -Your output should show a `/` in the MOUNTPOINT column of the `mmcblk0p1` line: - -``` -NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT -loop0 7:0 0 16M 1 loop -mtdblock0 31:0 0 32M 0 disk -mmcblk0 179:0 0 59.5G 0 disk -├─mmcblk0p1 179:1 0 59.2G 0 part / -├─mmcblk0p2 179:2 0 64M 0 part -├─mmcblk0p3 179:3 0 64M 0 part -├─mmcblk0p4 179:4 0 448K 0 part -├─mmcblk0p5 179:5 0 448K 0 part -├─mmcblk0p6 179:6 0 63M 0 part -├─mmcblk0p7 179:7 0 512K 0 part -├─mmcblk0p8 179:8 0 256K 0 part -├─mmcblk0p9 179:9 0 256K 0 part -├─mmcblk0p10 179:10 0 100M 0 part -└─mmcblk0p11 179:11 0 18K 0 part -zram0 252:0 0 1.9G 0 disk [SWAP] -zram1 252:1 0 1.9G 0 disk [SWAP] -nvme0n1 259:0 0 465.8G 0 disk -``` - -To setup the SSD: - -``` -# Wipe the SSD -sudo wipefs --all --force /dev/nvme0n1 - -# Partition the SSD -sudo parted --script /dev/nvme0n1 mklabel gpt mkpart primary ext4 0% 100% - -# Format the newly created partition -sudo mkfs.ext4 /dev/nvme0n1p1 - -# We will use the jetsonhacks scripts to move data and enable the SSD as -# the default disk - -git clone https://github.com/jetsonhacks/rootOnNVMe.git -cd rootOnNVMe/ -sudo ./copy-rootfs-ssd.sh -./setup-service.sh - -# Reboot for the update to take effect -sudo reboot -``` - -Run the `lsblk` command again to verify that you are running the OS from the SSD. 
This time, the `/` should be the MOUNTPOINT for `nvme0n1p1`: - - -``` -NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT -loop0 7:0 0 16M 1 loop -mtdblock0 31:0 0 32M 0 disk -mmcblk0 179:0 0 59.5G 0 disk -├─mmcblk0p1 179:1 0 59.2G 0 part /media/nvidia/48fc8f75-dc2b-4a68-9673-c4cc26f9d5db -├─mmcblk0p2 179:2 0 64M 0 part -├─mmcblk0p3 179:3 0 64M 0 part -├─mmcblk0p4 179:4 0 448K 0 part -├─mmcblk0p5 179:5 0 448K 0 part -├─mmcblk0p6 179:6 0 63M 0 part -├─mmcblk0p7 179:7 0 512K 0 part -├─mmcblk0p8 179:8 0 256K 0 part -├─mmcblk0p9 179:9 0 256K 0 part -├─mmcblk0p10 179:10 0 100M 0 part -└─mmcblk0p11 179:11 0 18K 0 part -zram0 252:0 0 1.9G 0 disk [SWAP] -zram1 252:1 0 1.9G 0 disk [SWAP] -nvme0n1 259:0 0 465.8G 0 disk -└─nvme0n1p1 259:1 0 465.8G 0 part / -``` - -### 3.3 Set up swap (both Nano and Xavier) - -Use the `configure_jetson.sh` script in this repo to set up swap space after you have rebooted and verified that you are running your Operating System from the SSD: - -``` -git clone https://github.com/MIDS-scaling-up/v3.git -cd v3/week01/hw -chmod +x configure_jetson.sh -./configure_jetson.sh -``` - -Install jtop (a monitoring tool from https://github.com/rbonghi/jetson_stats): - -``` -sudo apt update -sudo apt install -y python3-pip -sudo -H pip3 install -U jetson-stats -sudo reboot - -# Test after reboot -jtop -``` - - -## 4. Docker -Docker is a platform that allows you to create, deploy, and run applications in containers. The application and all its dependecies are packaged into one container that is easy to ship out and uses the same Linux kernel as the system it's running on, unlike a virtual machine. This makes it especially useful for compact platforms such as the Jetson. - -JetPack 4.3+ has Docker pre-installed, and has an experimental nvidia-docker support. - -Let's test it to see if it can run containers. 
Since the Jetson doesn't have the docker image 'hello-world' downloaded yet, Docker will automatically pull it online from the official repository: - -``` -docker run hello-world -``` - -Note, if you get a permissions error, run this command: -``` -sudo usermod -aG docker $USER -``` -Log out and log back in so that your group membership is re-evaluated. +## 1. Hugging Face account +Create an account on the Hugging Face portal, unless you already have one. +## 2. Google Colab +1. If you are not yet familiar with Google Colab, check out https://colab.research.google.com/notebooks/intro.ipynb +2. Follow the steps from https://huggingface.co/course/chapter0/1?fw=pt to make sure that your environment is working -### Run the base Docker Image for the Jetson -Most of the work in the class will require a Docker base image running Ubuntu 18.04 with all the needed dependencies. For the first time, in July 2019, Nvidia has released an officially supported base cuda container! Please register at the [Nvidia GPU Cloud](http://ngc.nvidia.com) and review the documentation for the [base jetson container](https://ngc.nvidia.com/catalog/containers/nvidia:l4t-base) - -Let's start this container: - -``` -# allow remote X connections -xhost + -# assuming that r32.4.3 is the latest version; but please check the NGC -docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix nvcr.io/nvidia/l4t-base:r32.4.4 -# this should complete successfully. Run this command to verify that you are inside the Docker container - -ls -# You should see the following: -# bin boot dev dst etc home lib media mnt opt proc root run sbin srv sys tmp usr var - -# Now exit from the container: -exit -``` -More on the use of this container is [here](https://github.com/NVIDIA/nvidia-docker/wiki/NVIDIA-Container-Runtime-on-Jetson) - -Note that our own old Docker images for the Jetsons are still available in the [docker hub](https://cloud.docker.com/u/w251/), e.g. 
```w251/cuda:tx2-4.3_b132``` and ```w251/cuda:dev-tx2-4.3_b132```. As the officially supported containers mature, we expect to sunset these; but for now, we'll keep them around just in case. We keep Dockerfiles for all of our containers [here](https://github.com/MIDS-scaling-up/v2/tree/master/backup) for your reference. +## 3. Transformer models +1. If you don't have prior familiarity with transformers, read through the materials at https://huggingface.co/course/chapter1/1?fw=pt +2. Complete the quiz at https://huggingface.co/course/chapter1/10?fw=pt -We'll cover Docker during the in-class lab in more detail. +## 4. Hugging Face 🤗 interface +1. Pick one of the demos hosted on the Hugging Face platform. Suggestion: https://huggingface.co/spaces/stabilityai/stable-diffusion +2. Play with different kinds of inputs and think critically about the results of inference (for example, when you provide a certain prompt and it generates output images, is it what you expected to see? Why or why not? Does this hint at potential biases of the model? What are the limitations of testing the model via this UI? etc.) +## 5. Papers With Code +1. Explore https://paperswithcode.com/. Understand how the portal is structured: navigation, content curation, etc. +2. Compare and contrast the organization of Hugging Face and Papers With Code. What would you use each for? How do they complement each other for a data scientist? -# To turn in -Please send a message on the class portal homework submission page indicating that you were able to set up your Jetson +# Turn In +To turn in, submit a PDF of your 3.2 quiz results and a free-text summary of your 4.2 exercise (no page limit; you're fine as long as the mini-report shows that you have a good understanding of what the model is supposed to do and what it actually did on the samples that you provided to it). For extra points, submit your thoughts on 5.2!
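The environment check in the HW 1 Colab step comes down to loading a pretrained model and running one inference through the `pipeline` API. A minimal sketch, assuming `transformers` is installed (e.g. `pip install transformers`) and the default sentiment model can be downloaded from the Hub on first use:

```python
from transformers import pipeline

# Load a default sentiment-analysis pipeline; the first call downloads a
# small pretrained model and tokenizer from the Hugging Face Hub.
classifier = pipeline("sentiment-analysis")

# Each prediction is a dict with a 'label' and a confidence 'score'.
result = classifier("I love the Hugging Face course!")
print(result)
```

If this prints a list like `[{'label': ..., 'score': ...}]`, your environment is working; the same snippet runs unchanged in a Colab notebook.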
diff --git a/week01/hw/Xavier_NX_Install_SSD.mp4 b/week01/hw/Xavier_NX_Install_SSD.mp4 deleted file mode 100644 index 68abd7a..0000000 Binary files a/week01/hw/Xavier_NX_Install_SSD.mp4 and /dev/null differ diff --git a/week01/hw/configure_jetson.sh b/week01/hw/configure_jetson.sh deleted file mode 100644 index 19f5304..0000000 --- a/week01/hw/configure_jetson.sh +++ /dev/null @@ -1,17 +0,0 @@ -#!/bin/sh - -# Add user to docker group to avoid sudo -sudo usermod -aG docker $USER - -# Turn of zram swap -sudo mv /etc/systemd/nvzramconfig.sh /etc/systemd/nvzramconfig.sh.save - -# Create and enable a 32 GB swap space -sudo mkdir /data -sudo fallocate -l 36G /data/swapfile -sudo chmod 600 /data/swapfile -sudo mkswap /data/swapfile -sudo swapon /data/swapfile -sudo swapon -s -echo "/data/swapfile swap swap defaults 0 0" | sudo tee -a /etc/fstab - diff --git a/week01/hw/remmina2.png b/week01/hw/remmina2.png deleted file mode 100644 index b3197ff..0000000 Binary files a/week01/hw/remmina2.png and /dev/null differ diff --git a/week01/hw/vnc1.png b/week01/hw/vnc1.png deleted file mode 100644 index f6e0a89..0000000 Binary files a/week01/hw/vnc1.png and /dev/null differ diff --git a/week01/hw/vnc2.png b/week01/hw/vnc2.png deleted file mode 100644 index b05b3a5..0000000 Binary files a/week01/hw/vnc2.png and /dev/null differ diff --git a/week01/lab/Dockerfile.yolov5 b/week01/lab/Dockerfile.yolov5 deleted file mode 100644 index fd7d051..0000000 --- a/week01/lab/Dockerfile.yolov5 +++ /dev/null @@ -1,33 +0,0 @@ -FROM nvcr.io/nvidia/l4t-pytorch:r32.6.1-pth1.9-py3 - -# tested on Jetson NX - -# Create working directory -RUN mkdir -p /usr/src/app -WORKDIR /usr/src/app - -RUN apt update && apt install -y libssl-dev -RUN pip3 install -U pip -# Copy contents -# COPY . 
/usr/src/app -RUN git clone https://github.com/ultralytics/yolov5 --branch v3.0 - -WORKDIR /usr/src/app/yolov5 - -# Install dependencies (pip or conda) -# RUN pip3 install -r requirements.txt - - -RUN apt update && apt install -y libffi-dev python3-pip curl unzip python3-tk libopencv-dev python3-opencv -RUN pip3 install -U gsutil pyyaml tqdm cython #torchvision -RUN apt install -y python3-scipy python3-matplotlib python3-numpy -RUN pip3 install git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI - -# RUN pip3 install requests -# RUN apt install -y python3-pandas -# RUN pip3 install seaborn -RUN pip3 install -U pip -RUN pip list -RUN pip3 install -r requirements.txt - - diff --git a/week01/lab/Readme.md b/week01/lab/Readme.md deleted file mode 100644 index b4fe670..0000000 --- a/week01/lab/Readme.md +++ /dev/null @@ -1,68 +0,0 @@ -# Lab 1: More Docker and YOLO - -Docker is great for projects since you can place and run everything you need in a container through a easy, reproducible process. We'll use Docker to build and run our very own container with Darknet and YOLO. - -This lab is run on the Jetson device using the desktop (via VNC or display). - -Ensure that you cloned this github repo and are in the directory for this lab (v2/week01/lab/). - -## Intro to YOLO -[YOLO v5](https://github.com/ultralytics/YOLOv5) was created by Ultralytics, a U.S.-based particle physics and AI startup with over 6 years of expertise supporting government, academic and business clients. It is the latest version of the You Only Look Once framework. - - -## Using YOLO with Docker -With Docker, we can run YOLO entirely self-contained instead of downloading Darknet and other dependencies manually. - -### Docker Basics -Some Docker terminology: An image is the program that you develop in Docker. A container is an instance of an image that you actually run. A traditional programming analogy would be that an image is a class, and a container is an object of that class. 
Images are created using Dockerfiles and can be stored and pulled from online repositories. Boot up the Jetson, open a terminal, and list the current images on the Jetson: - -``` -docker images -``` -You'll likely see the arm64/hello-world image created in HW #1. "latest" is the default tag of an image if you don't specify it. The image ID is the best way to refer to a image. Now show all containers: - -``` -docker ps -a -``` -There should be a container with "hello-world" listed under the "IMAGE" column, based off the hello-world image we just saw. Containers also have IDs as well as names (different from an image's tag) that are randomly generated if not specified. - -### Creating a Docker Image with a Dockerfile -This Dockerfile creates a container that will run YOLO-v5. It will download the model based on the `MODEL` environment variable you will create (further down in this lab). Build the image: - -``` -docker build -t yolov5 -f Dockerfile.yolov5 . -``` -Wait for the process to finish, then list the Docker images to see if it worked. You should see a new image with the label "YOLO" under the repository column. - -### Running YOLO with a Container -Connect a USB webcam to the Jetson. First, enable X so that the container can output to a window. - -``` -xhost + -``` -Now create and run a container with YOLO, starting with regular YOLO first: - -``` -MODEL=yolov5x.pt -CAM=0 -docker run --privileged --runtime nvidia --rm -v /data:/data -e DISPLAY -v /tmp:/tmp -ti yolov5 python3 detect.py --source $CAM --weights $MODEL --conf 0.4 -``` - -A new window should open with live video feed from the webcam. The terminal window displays FPS and objects detected with percentage of how sure YOLO thinks it's right. What FPS do you get? 
Try running YOLO with the smaller model: - -``` -MODEL=yolov5s.pt -CAM=0 -docker run --privileged --runtime nvidia --rm -v /data:/data -e DISPLAY -v /tmp:/tmp -ti yolov5 python3 detect.py --source $CAM --weights $MODEL --conf 0.4 -``` -What is the FPS now? - -Was one of the models more accurate? - -These containers automatically delete themselves after you exit due to the "--rm" flag. Containers tend to pile up if you don't manage them well. If you want to look inside the running container, omit the command that automatically opens YOLO upon running the container and add the flag "-ti" to enter interactive mode: - -``` -docker run -e DISPLAY=$DISPLAY --rm --privileged -v /tmp:/tmp -ti yolov5 -``` -Now you can explore with regular terminal commands. - diff --git a/week02/Readme.md b/week02/Readme.md index 0adf80d..4167ff5 100644 --- a/week02/Readme.md +++ b/week02/Readme.md @@ -9,7 +9,6 @@ Introduction to Cloud Computing and Cloud AI—Defining the cloud. How clouds ar * http://www.ibm.com/cloud-computing/us/en/what-is-cloud-computing.html * https://azure.microsoft.com/en-us/overview/what-is-cloud-computing/ * https://cloud.google.com/learn/what-is-cloud-computing - * Types of clouds: https://www.globaldots.com/cloud-computing-types-of-cloud/ * Cloud service types: https://www.fingent.com/blog/cloud-service-models-saas-iaas-paas-choose-the-right-one-for-your-business @@ -27,6 +26,28 @@ Introduction to Cloud Computing and Cloud AI—Defining the cloud. 
How clouds ar * AWS API: https://docs.aws.amazon.com/ * AWS CLI: https://aws.amazon.com/cli/ +### Virtual Machines +* https://www.ibm.com/cloud/learn/hypervisors +* https://www.stratoscale.com/blog/hyperconvergence/cloud-101-what-is-a-hypervisor/ +* https://www.redhat.com/en/topics/virtualization/what-is-KVM + + +### Containers +* https://avatao.com/life-before-docker-and-beyond-a-brief-history-of-container-security/ +* https://docs.docker.com/get-started/overview/ +* https://mkdev.me/en/posts/the-tool-that-really-runs-your-containers-deep-dive-into-runc-and-oci-specifications +* https://www.docker.com/blog/what-is-containerd-runtime/ + + +### Installing Docker +* https://docs.docker.com/get-docker/ + + +### Kubernetes +* https://kubernetes.io/docs/concepts/ +* https://kubernetes.io/docs/tutorials/kubernetes-basics/ +* https://rancher.com/docs/k3s/latest/en/architecture/ +* https://thenewstack.io/how-k3s-portworx-and-calico-can-serve-as-a-foundation-of-cloud-native-edge-infrastructure/ ### Vision as a Service demo: * https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/ diff --git a/week02/hw/Readme.md b/week02/hw/Readme.md index 05878e1..8ce7b63 100644 --- a/week02/hw/Readme.md +++ b/week02/hw/Readme.md @@ -1,4 +1,4 @@ -# HW02: Docker, Cloud ML Services(Sagemaker) and Pricing +# HW02: Docker, Cloud ML Services (Sagemaker) and Pricing *Complete Lab02 before starting HW02* @@ -213,27 +213,7 @@ You can take an AMI image snapshot before terminating the instance. -# PART2 - NOTE: Sagemaker cannot be run on the free tier ~~Setup and run Sagemaker Example~~ - -~~This HW further builds on using public cloud services with a primer on AWS Sagemaker. Sagemaker is a fully managed Machine Learning Service enabling -to easily build, train and deploy ML models with an integrated Jupyter Notebook instance.~~ - -- ~~Readup details on Sagemaker. 
https://docs.aws.amazon.com/sagemaker/latest/dg/ex1-preprocess-data.html~~ - -- ~~Create Sagemaker notebook instance. https://docs.aws.amazon.com/sagemaker/latest/dg/gs-console.html~~ - -- ~~Create a jupyter notebook and save it. https://docs.aws.amazon.com/sagemaker/latest/dg/ex1-prepare.html~~ - -- ~~Run the end to end Example https://github.com/aws/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/1P_kmeans_highlevel/kmeans_mnist.ipynb -(This incl. downloading MNIST dataset to your account's default S3 Object storage)~~ - -- ~~Once complete, Login to your account and check the resources the example created~~ - -- ~~Cleanup the environment by deleting the deployed endpoint, Notebook instance and S3 to not incur AWS charges~~ - - - -# PART3 - Pricing +# PART2 - Pricing #### Spot pricing @@ -289,7 +269,7 @@ Please update the limit for VCPU on my account to be 32 VCPUs for the g4dn.2xlar -# PART4 - Turn in +# Turn in ### Submit text file on class portal homework submission page with answers to the following Questions diff --git a/week03/README.md b/week03/README.md index bfcb567..7ea6cdb 100644 --- a/week03/README.md +++ b/week03/README.md @@ -1,42 +1,29 @@ # Lecture 3 -Virtual machines, containers, orchestration and Kubernetes, edge, and cloud native. +Object detection, COCO dataset, DETR model, ML pipelines, edge, and cloud native. 
## Reading -### Virtual Machines +### Object detection -https://www.ibm.com/cloud/learn/hypervisors +https://huggingface.co/docs/api-inference/detailed_parameters#object-detection-task -https://www.stratoscale.com/blog/hyperconvergence/cloud-101-what-is-a-hypervisor/ +https://cocodataset.org/#download -https://www.redhat.com/en/topics/virtualization/what-is-KVM +https://paperswithcode.com/task/object-detection +### DETR model -### Containers +https://huggingface.co/runwayml/stable-diffusion-v1-5 -https://avatao.com/life-before-docker-and-beyond-a-brief-history-of-container-security/ +https://huggingface.co/docs/transformers/model_doc/detr -https://docs.docker.com/get-started/overview/ +https://arxiv.org/abs/2005.12872 -https://mkdev.me/en/posts/the-tool-that-really-runs-your-containers-deep-dive-into-runc-and-oci-specifications +### Building ML pipelines -https://www.docker.com/blog/what-is-containerd-runtime/ +https://aws.amazon.com/blogs/machine-learning/architect-and-build-the-full-machine-learning-lifecycle-with-amazon-sagemaker/ - -### Installing Docker - -https://docs.docker.com/get-docker/ - - -### Kubernetes - -https://kubernetes.io/docs/concepts/ - -https://kubernetes.io/docs/tutorials/kubernetes-basics/ - -https://rancher.com/docs/k3s/latest/en/architecture/ - -https://thenewstack.io/how-k3s-portworx-and-calico-can-serve-as-a-foundation-of-cloud-native-edge-infrastructure/ +https://azure.github.io/ACE_Azure_ML/slides/AML_service.pptx ### Miscellaneous diff --git a/week03/demo/README.md b/week03/demo/README.md deleted file mode 100644 index 434eadb..0000000 --- a/week03/demo/README.md +++ /dev/null @@ -1,587 +0,0 @@ -# Introduction to Docker Examples - -This file provides a basic introduction to using Docker, with specific examples to using on the Nvidia XAVIER NX. - -These examples assume the use of one or more shells open. - -## Part 1: Installation -Docker is installed by default as part of the Nvidia Jetpack Install. 
For other platforms, see https://docs.docker.com/engine/install/ for installation instructions, e.g. macOS x86_64, Windows, Linux (CentOS, Ubuntu, etc.). Unless noted, these examples will run on any Docker installation.
-
-Note, at the time of writing, Docker Desktop for Apple M1 is still a tech preview.
-
-### Optional for the NX and Linux installations.
-By default, Docker is owned by and runs as the user root. This requires commands to be executed with sudo. If you don't want to use sudo, a group may be used instead. This group grants privileges equivalent to the root user. For details on how this impacts security in your system, see Docker Daemon Attack Surface (https://docs.docker.com/engine/security/#docker-daemon-attack-surface).
-The examples will assume this has been done. If you do not do this, you'll need to prefix the docker commands with `sudo`.
-
-1. Create the group docker. Note, this group may already exist.
-```
-sudo groupadd docker
-```
-2. Add your user to the docker group.
-```
-sudo usermod -aG docker $USER
-```
-3. Log out and log back in so that your group membership is re-evaluated.
-If testing on a virtual machine, it may be necessary to restart the virtual machine for changes to take effect.
-
-## Part 2: Registries
-An image registry provides a means of sharing images. Registries can be hosted, e.g. DockerHub or NVIDIA's NGC, or may be self-hosted. You'll want to sign up for a free account with DockerHub (https://hub.docker.com). This will allow you to easily share images.
-
-See https://docs.docker.com/docker-hub/ for the steps. Once your account is created, login via the command
-```
-docker login
-```
-and follow the prompts.
-
-## Part 3: Hello World
-
-Note, the following will run on any Docker installation.
-
-This example will run a simple hello world container. You'll start by running the command `docker run hello-world`.
-If things are correctly installed, you'll see an output similar to:
-```
-Unable to find image 'hello-world:latest' locally
-latest: Pulling from library/hello-world
-256ab8fe8778: Pull complete
-Digest: sha256:1a523af650137b8accdaed439c17d684df61ee4d74feac151b5b337bd29e7eec
-Status: Downloaded newer image for hello-world:latest
-
-Hello from Docker!
-This message shows that your installation appears to be working correctly.
-
-To generate this message, Docker took the following steps:
- 1. The Docker client contacted the Docker daemon.
- 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
-    (arm64v8)
- 3. The Docker daemon created a new container from that image which runs the
-    executable that produces the output you are currently reading.
- 4. The Docker daemon streamed that output to the Docker client, which sent it
-    to your terminal.
-
-To try something more ambitious, you can run an Ubuntu container with:
- $ docker run -it ubuntu bash
-
-Share images, automate workflows, and more with a free Docker ID:
- https://hub.docker.com/
-
-For more examples and ideas, visit:
- https://docs.docker.com/get-started/
-```
-
-This command "runs" a container based on the specified image, in this case, an image named hello-world.
-When the command was executed, Docker first checked if the image was available locally, and as it was not, it downloaded it for us. As we didn't specify a tag, it automatically downloaded the "latest" one. Next, it started the container, which printed out the message, then exited.
-
-We can run the command `docker images` to see that we have the hello-world image locally. The output will look similar to:
-```
-REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
-hello-world         latest              a29f45ccde2a        12 months ago       9.14kB
-```
-
-The command `docker ps` is used to list containers. The default command lists only running containers while the -a option is needed to list all containers.
-
-`docker ps`:
-```
-CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
-```
-vs
-`docker ps -a`:
-```
-CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS                      PORTS               NAMES
-697c6c524995        hello-world         "/hello"            10 minutes ago      Exited (0) 10 minutes ago                       tender_galois
-```
-
-A stopped container may be deleted with the command `docker rm <container>`. Delete your container and verify it has been removed.
-
-When running a container, we can also give it a name with the flag `--name <name>`. Note, docker prevents containers from having duplicate names. Running hello-world again with this flag:
-```
-docker run --name helloWorld hello-world
-```
-Note, the image is not downloaded again; rather, the local "cached" image is used.
-
-Run `docker ps -a` again and you should now see something similar to:
-```
-CONTAINER ID        IMAGE               COMMAND             CREATED              STATUS                          PORTS               NAMES
-1ee8c4789e21        hello-world         "/hello"            About a minute ago   Exited (0) About a minute ago                       helloWorld
-```
-
-This container can now be deleted with the command `docker rm helloWorld`.
-
-If you don't care to save the container when it exits, docker can automatically delete the container with the option --rm.
-Run the command `docker run --name helloWorld --rm hello-world` and then, when it exits, rerun `docker ps -a`. You'll find that the container has been removed.
-
-When done, you can remove the image with the command `docker rmi hello-world:latest`. An image may only be deleted when there are no containers using that image.
-
-### Docker command recap
-| Command | Description |
-| --- | --------- |
-| docker ps | Lists all running containers |
-| docker ps -a | Lists all containers |
-| docker images | Lists all local images |
-| docker run | Run an image as a container |
-| docker run --name | Run an image as a container, using the specified name for the container. |
-| docker run --rm | Run an image as a container and automatically delete it when it exits.
|
-
-## Part 4: Interacting with a container
-
-You'll now see a slightly more interesting example using the Ubuntu base image. You'll start by running the command `docker pull ubuntu:latest`. This will explicitly download the image for you, making sure you have the most up to date version. After the download is complete, run the command `docker run --name ubuntu --rm -it ubuntu bash`. The `-it` option enables interactive mode and allocates a pseudo-TTY. This allows us to interact with the container process. The `bash` command tells docker to run the bash shell as the container's process. This command will give you a shell that looks similar to
-```
-root@eb3d250d4de8:/#
-```
-Run some commands: apt-get update && apt-get install vim, etc. This will work as expected.
-
-As you know, Docker is process-based. On Linux systems, we can see this from the host by using the ps command. Note, non-Linux hosts such as macOS run Docker in a Virtual Machine and the host OS cannot see the process.
-
-I've installed VI (apt-get install -y vim) and am editing a file named helloWorld.txt via the command `vi helloWorld.txt`.
-
-```
-ps -ef | grep vi
-root      6648  5776  0 07:55 pts/0    00:00:00 vi helloWorld.txt
-```
-
-As expected, this is just a process!
-
-Exit out of your ubuntu container.
-
-Let's get 2 containers talking to each other! In one shell, run the following
-
-```
-docker run --name web --hostname web --rm -it ubuntu bash
-```
-
-And in a second
-
-```
-docker run --name db --hostname db --rm -it ubuntu bash
-```
-
-We are using the hostname option to force docker to set the hostname as something recognizable vs the container id.
-
-In the `web` container, let's install ping: `apt-get update && apt-get install iputils-ping -y`
-
-Once installed, let's try to ping `db` with the command `ping db`. You'll get an error back that says: `ping: db: No address associated with hostname`.
We need to find the container's IP Address; in a third shell, run the command `docker inspect db | grep IPAddress`. You'll get back a field `IPAddress`; take note of the value. For example, I get `"IPAddress": "172.17.0.3"`. Now, in web, ping your db's IP Address and you should get a response similar to:
-```
-root@web:/# ping 172.17.0.3
-PING 172.17.0.3 (172.17.0.3) 56(84) bytes of data.
-64 bytes from 172.17.0.3: icmp_seq=1 ttl=64 time=0.224 ms
-```
-Exit out of your 2 containers.
-
-By default, containers can talk by IP Address, but not by name. While this works, it is brittle and not portable. There used to be a link option on run, but this was replaced by user-defined networks.
-These user-defined networks provide:
-- DNS resolution between containers
-- Better isolation - only containers attached may communicate with each other.
-- Containers can be attached and detached on the fly
-
-You'll create a network with the command `docker network create demo` to create a network named `demo`. You can verify the network with the command `docker network ls`. We'll recreate our 2 containers, but this time using our new network.
-
-```
-In shell one:
-docker run --name web --hostname web --network demo --rm -it ubuntu bash
-
-In shell two:
-docker run --name db --hostname db --network demo --rm -it ubuntu bash
-
-```
-In web, install ping again and run the command `ping db`. This time, you'll get a response!
-
-```
-root@web:/# ping db
-PING db (172.19.0.3) 56(84) bytes of data.
-64 bytes from db.demo (172.19.0.3): icmp_seq=1 ttl=64 time=0.483 ms
-64 bytes from db.demo (172.19.0.3): icmp_seq=2 ttl=64 time=0.201 ms
-```
-
-Close out your containers and then delete your network with the command `docker network rm demo`.
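As an aside, the same two-container, user-defined-network setup can be captured declaratively with Docker Compose. This is a sketch only; the file layout is our own and Compose is not part of the original demo:

```yaml
# docker-compose.yml (hypothetical): the web/db pair from above.
# Compose attaches both services to a shared default user-defined
# network, so `ping db` resolves from inside `web` by service name.
services:
  web:
    image: ubuntu
    command: sleep infinity
  db:
    image: ubuntu
    command: sleep infinity
```

Started with `docker compose up -d`, this gives the same name-based DNS resolution as the manually created `demo` network, without creating and wiring the containers by hand.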
-
-
-### Docker command recap
-| Command | Description |
-| --- | --------- |
-| docker ps | Lists all running containers |
-| docker ps -a | Lists all containers |
-| docker images | Lists all local images |
-| docker run | Run an image as a container |
-| docker run --name | Run an image as a container, using the specified name for the container. |
-| docker run --rm | Run an image as a container and automatically delete it when it exits. |
-| docker network create | Create a user defined network |
-| docker network ls | List networks |
-| docker network rm | Delete a network |
-| docker run -it ... | Enable the ability to interact with a container via stdin and stdout |
-| docker run --hostname ... | Set the hostname of the container |
-| docker run --network ... | Set the container's network |
-| docker inspect | Return low-level information on Docker objects |
-
-## Part 5: More Networking and Persistent files
-Note, this is just an example. For production applications, the files used here should be baked into the image.
-
-In this example, we'll be working with Nginx, a robust HTTP server. You'll launch your instance with the following command: `docker run -d --name web --hostname web --rm nginx`. The `-d` option runs the container in detached mode. This means the container is running in the background!
-
-```
-rdejana@nx:~$ docker run -d --name web --hostname web --rm nginx
-Unable to find image 'nginx:latest' locally
-latest: Pulling from library/nginx
-c9648d7fcbb6: Pull complete
-af2653e2da79: Pull complete
-1af64ee707c7: Pull complete
-3bdc08a2d3ea: Pull complete
-fed23bd0d00d: Pull complete
-Digest: sha256:4cf620a5c81390ee209398ecc18e5fb9dd0f5155cd82adcbae532fec94006fb9
-Status: Downloaded newer image for nginx:latest
-2d5a5aa602e5e3c6198ad3ae9733d33255c486fef1ef7a29e56bc42d6cadd9e1
-rdejana@nx:~$
-```
-
-You'll use the `exec` command to access the container. Run `docker exec -ti web bash`.
This will attach us to the container and start a bash process for you to interact with.
-Once in the container, verify that Nginx is running by curling the server: `curl http://localhost` and you'll see:
-```
-root@web:/# curl http://localhost
-<!DOCTYPE html>
-<html>
-<head>
-<title>Welcome to nginx!</title>
-</head>
-<body>
-<h1>Welcome to nginx!</h1>
-<p>If you see this page, the nginx web server is successfully installed and
-working. Further configuration is required.</p>
-
-<p>For online documentation and support please refer to
-<a href="http://nginx.org/">nginx.org</a>.<br/>
-Commercial support is available at
-<a href="http://nginx.com/">nginx.com</a>.</p>
-
-<p><em>Thank you for using nginx.</em></p>
-
-</body>
-</html>
-```
-
-Now how to access this container from OUTSIDE the container? Exit out of the shell and stop the container with `docker stop web`. Recreate the container with the command:
-`docker run -d --name web --hostname web -p 8080:80 --rm nginx`. The `-p #:#` maps a port on the host, 8080 in this case, to a port in the container, in this case 80. From a web browser, access your NX, by name or IP, on port 8080, e.g. http://nx:8080, and you should get back your web page.
-
-You can run the command `docker logs web` to see the container's output.
-```
-docker logs web
-/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
-/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
-/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
-10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf
-10-listen-on-ipv6-by-default.sh: info: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf
-/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
-/docker-entrypoint.sh: Configuration complete; ready for start up
-192.168.1.199 - - [05/Jan/2021:15:51:26 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.2 Safari/605.1.15" "-"
-```
-
-You'll now customize the HTML page. By default, the HTML files are located in the directory `/usr/share/nginx/html`. You'll want to install VI (or another editor) (apt-get update && apt-get install -y vim). Change to the HTML dir and edit the file index.html, replacing the text with:
-```
-<html>
-<head>
-<title>Hello World</title>
-</head>
-<body>
-<h1>Hello World from Docker and Nginx!</h1>
-</body>
-</html>
-```
-Save the file and access the web page again. Now stop your container and start it again. Notice that your changes are now gone. This is because, when the container is deleted, all of your changes are deleted. We'll address this by creating a persistent file system. On your host, create a directory where you'll want to store HTML. Change to that directory and create the index.html file with the content you used before. Now launch your container with the following:
-```
-docker run -d --name web --hostname web -p 8080:80 --rm -v <directory>:/usr/share/nginx/html nginx
-```
-replacing `<directory>` with your actual directory. Access your page again and you'll see your changes! You've created a container using a bind mount. This is used to share content from your host to your container. Another option is to use a volume. A volume is a more powerful and portable solution, but one in which the "where" is abstracted out. If you are interested in volumes, see the Docker documentation.
-
-Stop your container.
-
-## Part 6. Building and sharing your own custom image.
-Using existing images is great, but Docker also allows you to create and share your own images. Building your own image starts with a Dockerfile. In this example, we'll use the following Dockerfile
-```
-# Using ubuntu 18.04 as base image
-FROM ubuntu:18.04
-# update the base image
-RUN apt-get update && apt-get -y update
-# install
-RUN apt-get install python3-pip python3-dev build-essential nodejs -y
-# make python3 -> python
-RUN ln -s /usr/bin/python3 /usr/local/bin/python
-# update pip
-RUN pip3 install --upgrade pip
-# install jupyter and lab
-RUN pip3 install jupyter
-RUN pip3 install jupyterlab
-# set our workdir
-WORKDIR /src/notebooks
-COPY notebooks/simple.ipynb ./
-# Setup which command to run...
-# This runs jup notebook
-CMD ["jupyter", "notebook", "--port=8888", "--no-browser", "--ip=0.0.0.0", "--allow-root"]
-# This runs jup lab
-#CMD ["jupyter", "lab", "--port=8888", "--no-browser", "--ip=0.0.0.0", "--allow-root"]
-```
-This file is located in this repository under the directory `docker/docker`.
-
-The `FROM` tells Docker which base image we are using. In this case, you are using the ubuntu 18.04 base image.
-
-`RUN` runs a command in the container. The run commands here are split for readability; best practice would be to combine them where it makes sense. Here we are running a set of commands that update the base image, install python3 and install jupyter.
-
-`WORKDIR` sets up where the container is running from. The directory is created if it doesn't already exist.
-
-`COPY` enables the copying of files from the host to the container.
-
-`CMD` is the command that the container will run at start up. In this example, it'll start up a notebook.
-
-`#` is a comment. Comments are not executed by Docker.
-
-To build an image, you'll use the build command. You'll want to clone this repo and change to the docker directory.
-
-Run `docker build -t myimage .`. This will create an image named myimage and use the default Dockerfile. You'll see output like:
-```
- docker build --no-cache -t myimage .
-Sending build context to Docker daemon 4.608kB -Step 1/10 : FROM ubuntu:18.04 - ---> 2c047404e52d -Step 2/10 : RUN apt-get update && apt-get -y update - ---> Running in 076f304f5fa0 -Get:1 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB] -Get:2 http://archive.ubuntu.com/ubuntu bionic InRelease [242 kB] -Get:3 http://security.ubuntu.com/ubuntu bionic-security/restricted amd64 Packages [247 kB] -Get:4 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages [1845 kB] -Get:5 http://security.ubuntu.com/ubuntu bionic-security/multiverse amd64 Packages [14.9 kB] -Get:6 http://security.ubuntu.com/ubuntu bionic-security/universe amd64 Packages [1376 kB] -Get:7 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB] -Get:8 http://archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB] -Get:9 http://archive.ubuntu.com/ubuntu bionic/main amd64 Packages [1344 kB] -Get:10 http://archive.ubuntu.com/ubuntu bionic/multiverse amd64 Packages [186 kB] -Get:11 http://archive.ubuntu.com/ubuntu bionic/restricted amd64 Packages [13.5 kB] -Get:12 http://archive.ubuntu.com/ubuntu bionic/universe amd64 Packages [11.3 MB] -Get:13 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages [2272 kB] -... -Removing intermediate container 076f304f5fa0 - ---> c60341f64168 -Step 3/10 : RUN apt-get install python3-pip python3-dev build-essential nodejs -y - ---> Running in 7c6214716125 -Reading package lists... -Building dependency tree... -Reading state information... 
-The following additional packages will be installed:
- binutils binutils-common binutils-x86-64-linux-gnu ca-certificates cpp cpp-7
- dbus dh-python dirmngr dpkg-dev fakeroot file g++ g++-7 gcc gcc-7 gcc-7-base
- gir1.2-glib-2.0 gnupg gnupg-l10n gnupg-utils gpg gpg-agent gpg-wks-client
- gpg-wks-server gpgconf gpgsm libalgorithm-diff-perl
- libalgorithm-diff-xs-perl libalgorithm-merge-perl libapparmor1 libasan4
- libasn1-8-heimdal libassuan0 libatomic1 libbinutils libc-ares2 libc-dev-bin
-
-info_1.9-2_amd64.deb ...
-
-...
-Step 8/10 : WORKDIR /src/notebooks
- ---> Running in 8bac2a0a6c5c
-Removing intermediate container 8bac2a0a6c5c
- ---> d6e882fd43c4
-Step 9/10 : COPY notebooks/simple.ipynb ./
- ---> df233fcc6b7c
-Step 10/10 : CMD ["jupyter", "notebook", "--port=8888", "--no-browser", "--ip=0.0.0.0", "--allow-root"]
- ---> Running in ea673ebdbbed
-Removing intermediate container ea673ebdbbed
- ---> d28014cd5559
-Successfully built d28014cd5559
-Successfully tagged myimage:latest
-skywalker:docker rdejana$
-```
-
-Go ahead and launch the image with the command `docker run -ti -p 8888:8888 myimage`. You'll see some info with the token value displayed to stdout. Use that to log into your notebook. You can then open simple.ipynb and run it.
-
-You'll now push this image to DockerHub. Run the command `docker tag myimage <dockerid>/myjupyter`, replacing `<dockerid>` with your actual value. For me, it would be `docker tag myimage rdejana/myjupyter`. You'll then push the image to DockerHub; `docker push rdejana/myjupyter`.
-```
-skywalker:demo rdejana$ docker tag myimage rdejana/myjupyter
-skywalker:demo rdejana$ docker push rdejana/myjupyter
-Using default tag: latest
-The push refers to repository [docker.io/rdejana/myjupyter]
-54601f934e1c: Mounted from rdejana/mytestimagefordocker
-4076f41a98fa: Mounted from rdejana/mytestimagefordocker
-3dc3b21f8df2: Mounted from rdejana/mytestimagefordocker
-62db2e220080: Mounted from rdejana/mytestimagefordocker
-aacaab36a3af: Mounted from rdejana/mytestimagefordocker
-927cc003fdc7: Mounted from rdejana/mytestimagefordocker
-a3e6098f0a63: Mounted from rdejana/mytestimagefordocker
-317dbebecdcd: Mounted from rdejana/mytestimagefordocker
-fe6d8881187d: Mounted from rdejana/mytestimagefordocker
-23135df75b44: Mounted from rdejana/mytestimagefordocker
-b43408d5f11b: Mounted from rdejana/mytestimagefordocker
-latest: digest: sha256:eed27ee55faf1e9fe0a85eadd0e3146fa87876d8e8e25e3fd7c4fddd497a0768 size: 2624
-```
-Navigate back to DockerHub (https://hub.docker.com) and verify that you can see your image. Now delete your local image, `docker rmi myimage`, and pull it from DockerHub, `docker pull <dockerid>/myjupyter`. You've successfully shared an image!
-
-## Part 7: CPU Architectures
-From your NX, run the command `docker pull rdejana/ubuntu`. Now start a container with the command `docker run -ti --rm rdejana/ubuntu bash`. Rather than a prompt, you'll get this error message:
-```
-standard_init_linux.go:211: exec user process caused "exec format error"
-```
-
-If docker inspect is used to find the architecture, you'll see that the image is an amd64 (x86_64) image. Running `docker info` will show that your NX uses an aarch64 architecture. While many DockerHub images provide multiple architectures, you'll need to make sure the image you want to use is supported on your NX.
-
-
-# GPU
-
-## Part 1: Configure runtime
-These must be run on the NX. You'll want to run with a monitor attached. Nvidia provides a runtime to enable Docker to use GPUs.
You can verify that the runtime is installed by running the command
-`docker info | grep nvidia`. You should see `Runtimes: nvidia runc`. We'll be setting the runtime to be nvidia by default. Edit the file `/etc/docker/daemon.json`, e.g. `sudo vi /etc/docker/daemon.json`, adding/setting the `default-runtime` to `nvidia`.
-```
-{
-    "runtimes": {
-        "nvidia": {
-            "path": "nvidia-container-runtime",
-            "runtimeArgs": []
-        }
-    },
-    "default-runtime": "nvidia"
-}
-```
-
-Reboot your NX and login when the reboot is completed.
-From a shell, run the following:
-```
-# Allow containers to communicate with Xorg
-$ sudo xhost +si:localuser:root
-$ sudo docker run --runtime nvidia --network host -it -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix nvcr.io/nvidia/l4t-base:r32.3.1
-
-root@nano:/# apt-get update && apt-get install -y --no-install-recommends make g++
-root@nano:/# cp -r /usr/local/cuda/samples /tmp
-root@nano:/# cd /tmp/samples/5_Simulations/nbody
-root@nano:/# make
-root@nano:/# ./nbody
-```
-
-This will display an N-body simulation, running in a container and displaying on your UI.
-
-## Part 2: TensorFlow
-You'll start by pulling the official Jetson TensorFlow image with the command `docker pull nvcr.io/nvidia/l4t-tensorflow:r32.4.4-tf2.3-py3`. Details on this image and other images can be found at the Nvidia NGC registry, https://ngc.nvidia.com/catalog/containers/nvidia:l4t-tensorflow. Now start the container: `docker run -it --rm --network host nvcr.io/nvidia/l4t-tensorflow:r32.4.4-tf2.3-py3`.
-
-Once you have the prompt, start python3. From the python3 prompt, enter the following
- ```
- >>> import tensorflow as tf
- >>> print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
- Num GPUs Available: 1
- ```
-You've now used your Nvidia GPU with TensorFlow from a container! Other images from Nvidia include PyTorch with GPU support, along with images with SciPy and Pandas pre-installed.
-
-
-# NX and Kubernetes
-
-## Part 1: Install and verify Kubernetes
-In this demo, we'll explore installing Kubernetes on the Jetson Xavier NX and then deploying a simple application.
-We'll be using an approach based on https://thenewstack.io/tutorial-deploying-tensorflow-models-at-the-edge-with-nvidia-jetson-nano-and-k3s/ and a version of Kubernetes called K3s, a lightweight Kubernetes distribution designed for the Edge.
-
-You'll first want to make sure that the default runtime for Docker is set to nvidia. Confirm that the file `/etc/docker/daemon.json` looks like:
-```
-{
-    "runtimes": {
-        "nvidia": {
-            "path": "nvidia-container-runtime",
-            "runtimeArgs": []
-        }
-    },
-
-    "default-runtime": "nvidia"
-}
-
-```
-
-If changes are needed, you'll need to either reboot your NX or restart the docker service.
-```
-sudo systemctl restart docker
-```
-
-
-To install K3s, run the following:
-```
-mkdir $HOME/.kube/
-curl -sfL https://get.k3s.io | sh -s - --docker --write-kubeconfig-mode 644 --write-kubeconfig $HOME/.kube/config
-```
-
-This installs K3s and has it use Docker instead of containerd.
-
-After a few minutes, you'll have Kubernetes up and running.
-```
-kubectl get nodes
-NAME   STATUS   ROLES                  AGE   VERSION
-nx     Ready    control-plane,master   27s   v1.20.0+k3s2
-```
-
-You will create a simple deployment of an Nginx web server. Create a file named nginx.yaml with the following content:
-```
-apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
-kind: Deployment
-metadata:
-  name: nginx-deployment
-spec:
-  selector:
-    matchLabels:
-      app: nginx
-  replicas: 1 # tells deployment to run 1 pod matching the template
-  template:
-    metadata:
-      labels:
-        app: nginx
-    spec:
-      containers:
-      - name: nginx
-        image: nginx:1.14.2
-        ports:
-        - containerPort: 80
-
-```
-
-This file describes what we want running: a single replica of an Nginx container, listening on port 80.
-
-Now run the command `kubectl apply -f nginx.yaml`. This will create the deployment.
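For reference, the `kubectl expose` step used to reach this deployment can also be written as a declarative Service manifest. This is a sketch only, assuming the `app: nginx` label from nginx.yaml; the file name is our own:

```yaml
# nginx-service.yaml (sketch): NodePort alternative to `kubectl expose`
apiVersion: v1
kind: Service
metadata:
  name: nginx-deployment
spec:
  type: NodePort
  selector:
    app: nginx        # matches the pod label from nginx.yaml
  ports:
    - port: 80        # service port
      targetPort: 80  # container port
```

Applying this with `kubectl apply -f nginx-service.yaml` has the same effect as the expose command, with the advantage that the service definition can be kept under version control alongside the deployment.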
Run the command `kubectl get pods` to watch the containers start up. Once they are running, run the command `kubectl expose deployment nginx-deployment --port=80 --type=NodePort` to create a service, or means of access, for your deployment. A NodePort service means we are using a "random" port on the host. Run the command `kubectl get service nginx-deployment` to list the service. Notice the PORT section, and look for `80:####` where `####` is the port your nginx container is exposed on.
-
-To clean up, run `kubectl delete service nginx-deployment` and `kubectl delete deployment nginx-deployment`.
-
-### Stopping and Starting Kubernetes
-Kubernetes is installed as a systemd service and is configured to start automatically. You can disable this with the following command:
-```
-sudo systemctl disable k3s
-```
-The service can be started with the command:
-```
-sudo systemctl start k3s
-```
-and stopped with:
-```
-sudo systemctl stop k3s
-```
-If k3s doesn't stop cleanly, a reboot will be needed.
-
-
-## Part 2: Kubernetes and GPU.
-Start Kubernetes if needed. On your NX, create a file named tf.yaml with the following content:
-```
-apiVersion: v1
-kind: Pod
-metadata:
-  name: tensorflow
-spec:
-  containers:
-  - name: tf
-    image: nvcr.io/nvidia/l4t-tensorflow:r32.4.4-tf2.3-py3
-    command: [ "/bin/bash", "-c", "--" ]
-    args: [ "while true; do sleep 30; done;" ]
-```
-This will create a simple pod running the bash shell.
-
-Run `kubectl apply -f tf.yaml` to create the pod. Now run `kubectl get pods` and verify that the pod is running. This may take some time if the image needs to be downloaded. You'll now exec into the container, launching python3 directly, with the command `kubectl exec -it tensorflow -- python3`.
-
-From the prompt, enter the following:
- ```
- >>> import tensorflow as tf
- >>> print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
- Num GPUs Available: 1
- ```
-
- Exit python and delete your pod with the command `kubectl delete pod tensorflow`. You may now stop Kubernetes or reboot as needed.
-
diff --git a/week03/demo/docker/Dockerfile b/week03/demo/docker/Dockerfile
deleted file mode 100644
index c8cd461..0000000
--- a/week03/demo/docker/Dockerfile
+++ /dev/null
@@ -1,22 +0,0 @@
-# Using ubuntu 18.04 as base image
-FROM ubuntu:18.04
-# update the base image
-RUN apt-get update && apt-get -y update
-# install
-RUN apt-get install python3-pip python3-dev build-essential nodejs -y
-# make python3 -> python
-RUN ln -s /usr/bin/python3 /usr/local/bin/python
-# update pip
-RUN pip3 install --upgrade pip
-# install jupyter and lab
-RUN pip3 install jupyter
-RUN pip3 install jupyterlab
-# set our workdir
-WORKDIR /src/notebooks
-# Copy our notebook over
-COPY notebooks/simple.ipynb ./
-# Setup which command to run...
-# This runs jup notebook -CMD ["jupyter", "notebook", "--port=8888", "--no-browser", "--ip=0.0.0.0", "--allow-root"] -# This runs jup lab -#CMD ["jupyter", "lab", "--port=8888", "--no-browser", "--ip=0.0.0.0", "--allow-root"] diff --git a/week03/demo/docker/notebooks/simple.ipynb b/week03/demo/docker/notebooks/simple.ipynb deleted file mode 100644 index 79c32a1..0000000 --- a/week03/demo/docker/notebooks/simple.ipynb +++ /dev/null @@ -1,34 +0,0 @@ -{ - "cells": [ - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "print(\"Hello World!\")" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.9" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/week03/hw/Pipeline.PNG b/week03/hw/Pipeline.PNG new file mode 100644 index 0000000..1636bbe Binary files /dev/null and b/week03/hw/Pipeline.PNG differ diff --git a/week03/hw/Pipeline_raw.pptx b/week03/hw/Pipeline_raw.pptx new file mode 100644 index 0000000..114409a Binary files /dev/null and b/week03/hw/Pipeline_raw.pptx differ diff --git a/week03/hw/README.md b/week03/hw/README.md index 55a1959..9635984 100644 --- a/week03/hw/README.md +++ b/week03/hw/README.md @@ -1,90 +1,74 @@ -# Homework 3 - Containers, Kubernetes, and IoT/Edge +# Homework 3 - ML pipelines and IoT/Edge -## Please note that this homework is graded - -### Note: 01/10/2022 -If you have upgraded (apt upgrade) your Jetson's installation and are getting an error similar to the following when running docker: -``` -docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: error adding seccomp 
filter rule for syscall clone3: permission denied: unknown. -``` -See https://forums.developer.nvidia.com/t/docker-isnt-working-after-apt-upgrade/195213/3. There you'll find the instructions to downgrade and pin the docker version at one that works. As an alternative, you may add the following option to your run command, `-security-opt seccomp=unconfined`, e.g. -``` -docker run -it --rm --runtime nvidia --security-opt seccomp=unconfined --network host nvcr.io/nvidia/l4t-pytorch:r32.6.1-pth1.9-py3 -``` - -### Note: If you do not have a Jetson device. -If you currently do not have a Jetson device, this homework can be done using a virtual machine running on your workstatation. See lab3's labNoJetson.md for details. In the instructions, replace Jetson with your local VM. As OpenCV is most likely not installed in your VM, you may find the cascade file at https://github.com/opencv/opencv/tree/master/data/haarcascades. + :warning: **Please note that this homework is graded** ## Instructions -The objective of this homework is to buld a lightweight containerized application pipeline with components running on the edge, your Jetson, and in the the cloud, a VM in AWS. The application should be writen in a modular/cloud native way so that it could be run on any edge devce or hub and any cloud VM, or even another type of device connected to some type of storage instead of cloud hosted VM. In addition, the edge application should be deployed using Kubernetes (K3s for example) on your Jetson and the cloud VM components should run using Docker. - -You will build an application that is able to capture faces in a video stream coming from the edge, then transmit them to the cloud via MQTT and saving these faces for "long term storage". For the face detector component, we ask that you use OpenCV and write an application that scans the video frames coming from the connected USB camera for faces. 
When one or more faces are detected in the frame, the application should cut them out of the frame and send via a binary message each. Your edge applicaiton should use MQTT as your messaging fabric. As you'll be treating your Jetson as hub, you'll need a broker installed on the Jetson, and that your face detector sends its messages to this broker first. You'll then need another component that receives these messages from the local broker, and sends them to the cloud [MQTT broker]. Because edge applications often use messages to communicate with other local components, you'll need another local listener that just outputs to its log (standard out) that it has received a face message.
-
-In the cloud, you need to provision a lightweight virtual machine (1-2 CPUs and 2-4 G of RAM should suffice) and run an MQTT broker in a Docker container. As discussed above, the faces will need to be sent here as binary messages. You'll need a second component here that receives the messages and saves the images to to the s3 Object storage, ideally via [boto](https://pypi.org/project/boto) (e.g. see a code sample here: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-uploading-files.html) .
-
-Please don't be intimidated by this homework as it is mostly a learning experience on the building blocks. The concept of the Internet of Things deals with a large number of devices that communicate largely through messaging. Here, we have just one device and one sensor- the camera. But, we could add a bunch more sensors like microphones, GPS, proximity sensors, lidars, etc.
-
+The objective of this homework is to build a lightweight containerized application pipeline with components running on the edge (emulated by capturing a video or a set of videos) and in the cloud (a VM in AWS).
The application should be written in a modular/cloud-native way so that it could be run on any edge device or hub and any cloud VM, or even another type of device connected to some type of storage instead of a cloud-hosted VM.
-On the Jetson, we request that you use [Alpine Linux](https://alpinelinux.org/) as the base OS for your MQTT containers as it is frugal in terms of storage. You will need to use Ubuntu as the base for your OpenCV container. Please recall that Jetson devices and Raspberry Pis are both based on the [ARM v8 architecture](https://en.wikichip.org/wiki/arm/armv8) as opposed to Intel x86/64 architecture.
+Firstly, you capture videos with your smartphone / tablet / digital camera. We recommend that you do something that is connected to your hobbies or a problem of interest. You can be creative - e.g., you can put a camera next to a bird feeder in your backyard. You can record fish or turtles in your aquarium. You can record yourself doing fitness exercises. You can record what your dog is doing while you're not at home. You can record what's happening in your fridge. You can record a sports game with your friends (but don't forget to ask for their permission 😉). You can record your collection of Star Trek figurines, etc. This video stream is your data in this assignment.
-For details on using MQTT with Apline and Ubuntu, refer to Lab 3.
+Next, you will build an application that is able to process this video stream coming from the edge, use an object detection model to identify frames with specific objects of interest, then transmit them to the cloud and save them for "long term storage". For the object detector component, we ask that you use the DETR neural network and write an application that scans the video frames for objects that are relevant to your task. Depending on the domain that you chose, it may be a dog, a bird, a squirrel, a human, etc.
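Since the edge is emulated by a recorded video here, a natural first step is deciding how often to pull a frame out of it. A minimal, hypothetical sketch in Python (the function name and the default of every 10th frame are illustrative, not part of the assignment starter code):

```python
# Keep one frame out of every N; N depends on how fast your scene changes
# (a hyperactive puppy needs a much smaller N than a sloth).
def sample_frames(frames, every_n=10):
    """Return every `every_n`-th frame from an iterable of frames."""
    if every_n < 1:
        raise ValueError("every_n must be >= 1")
    return [frame for i, frame in enumerate(frames) if i % every_n == 0]

# e.g. a 300-frame clip sampled at every 10th frame yields 30 frames
print(len(sample_frames(range(300), every_n=10)))  # -> 30
```

The same idea applies whether `frames` comes from OpenCV's `VideoCapture`, a list of decoded images, or any other iterator.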
When one or more objects of interest are detected in the frame, the application should cut each one out of the frame and send it as a binary message.
-[OpenCV](https://opencv.org/) is THE library for computer vision. At the moment it has fallen behind the Deep Learning curve, but it could catch up at any moment. For traditional, non-DL image processing, it is unmatched.
+In the cloud, you need to provision a lightweight virtual machine (1-2 CPUs and 2-4 G of RAM should suffice). As discussed above, the images will need to be sent here as binary messages. You'll need a second component here that receives the messages and saves the images to the S3 object storage, ideally via [boto](https://pypi.org/project/boto) (e.g. see a code sample here: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-uploading-files.html).
+Please don't be intimidated by this homework as it is mostly a learning experience on the building blocks. The concept of the Internet of Things deals with a large number of devices that communicate largely through messaging. Here, we have just one device and one sensor: the camera. But we could add a bunch more sensors like microphones, GPS, proximity sensors, lidars, etc.
-Refer to Lab 3 for how to get started with OpenCV and some addtional hints for getting started with OpenCV in a container are [here](https://github.com/rdejana/w251-hints/tree/master/hw3), if you need them.
+[DETR-ResNet-50](https://huggingface.co/facebook/detr-resnet-50) is [the most downloaded](https://huggingface.co/models?pipeline_tag=object-detection&sort=downloads) ML model for the object detection task on Hugging Face 🤗 at the moment, which is one of the reasons why we want you to get hands-on experience with it. More on the reasons later...
-### Facial detection with OpenCV
-We suggest that you use a simple pre-trained frontal face HAAR Cascade Classifier [documented here](https://docs.opencv.org/3.4.1/d7/d8b/tutorial_py_face_detection.html).
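The cloud-side saver can stay very small. Below is a hedged sketch of the upload step, assuming the modern `boto3` package (the successor to the `boto` package linked above) plus AWS credentials; the bucket and key names are placeholders you choose:

```python
import io

def upload_bytes(payload: bytes, bucket: str, key: str):
    """Upload one binary message (e.g. a cropped detection) to S3.

    Requires the boto3 package and configured AWS credentials to actually
    reach S3; `bucket` and `key` are names you pick for your project.
    """
    import boto3  # imported lazily so the module still loads without boto3
    s3 = boto3.client("s3")
    s3.upload_fileobj(io.BytesIO(payload), bucket, key)
```

Wiring this into the receiver is then one call per incoming message, e.g. `upload_bytes(msg, "my-hw3-bucket", f"detections/{n}.png")` (bucket/key shown here are made up).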
There is no need to detect eyes,just the face. Notice how simple it is to use:
-```
-import numpy as np
-import cv2 as cv
-face_cascade = cv.CascadeClassifier('haarcascade_frontalface_default.xml')
-
-# gray here is the gray frame you will be getting from a camera
-gray = cv.cvtColor(gray, cv.COLOR_BGR2GRAY)
-faces = face_cascade.detectMultiScale(gray, 1.3, 5)
-for (x,y,w,h) in faces:
-    # your logic goes here; for instance
-    # cut out face from the frame..
-    # rc,png = cv2.imencode('.png', face)
-    # msg = png.tobytes()
-    # ...
+### Object detection with DETR
+We suggest that you use a simple pre-trained version of the model [documented here](https://huggingface.co/docs/transformers/model_doc/detr). Notice how simple it is to use, despite being quite complex inside:
 ```
-
+from transformers import DetrFeatureExtractor, DetrForObjectDetection
+import torch
+from PIL import Image
+import requests
+
+url = "http://images.cocodataset.org/val2017/000000039769.jpg"
+image = Image.open(requests.get(url, stream=True).raw)
+
+feature_extractor = DetrFeatureExtractor.from_pretrained("facebook/detr-resnet-50")
+model = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50")
+
+inputs = feature_extractor(images=image, return_tensors="pt")
+outputs = model(**inputs)
+
+# convert outputs (bounding boxes and class logits) to COCO API
+target_sizes = torch.tensor([image.size[::-1]])
+results = feature_extractor.post_process(outputs, target_sizes=target_sizes)[0]
+
+for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
+    box = [round(i, 2) for i in box.tolist()]
+    # let's only keep detections with score > 0.9
+    if score > 0.9:
+        print(
+            f"Detected {model.config.id2label[label.item()]} with confidence "
+            f"{round(score.item(), 3)} at location {box}"
+        )
 ```
-Note, you can find the OpenCV cascade files on your Jetson in the directory /usr/share/opencv4/haarcascades
-```
-
-### Linking containers
-On the Jetson, your containers should
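Building on the DETR snippet above, one way to turn a detected box into the binary message the assignment asks for is to crop it out and PNG-encode it with Pillow (a sketch under the assumption that frames are `PIL.Image` objects; the helper name is ours, not part of any library):

```python
from io import BytesIO
from PIL import Image  # Pillow, already used by the DETR snippet above

def crop_to_png_bytes(image, box):
    """Cut a detected box (xmin, ymin, xmax, ymax) out of a frame and
    serialize the crop as PNG bytes, ready to send as a binary message."""
    xmin, ymin, xmax, ymax = (int(round(v)) for v in box)
    crop = image.crop((xmin, ymin, xmax, ymax))
    buf = BytesIO()
    crop.save(buf, format="PNG")
    return buf.getvalue()
```

Inside the detection loop you would call it as something like `crop_to_png_bytes(image, box)` for every box that clears your confidence threshold.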
communicate via Kubernetes services, see Lab 3 for details. On the cloud side, you should use a user defined network to enable your containers to easily communicate. Please review the [docker networking tutorial](https://docs.docker.com/network/network-tutorial-standalone/#use-user-defined-bridge-networks). The idea is that you will need to create a local bridge network and then the containers you will create will join it.
+You can learn more details about DETR in [the original article](https://arxiv.org/abs/2005.12872).

### Overall architecture / flow
-Your overall application flow / architecture should be something like: ![this](hw3.png).
-
-### Bonus Points
-You can recieve an extra 10 bonus points for using Kubernetes on the cloud side rather than Docker.
-
+Your overall application flow / architecture should be something like: ![this](https://github.com/alsavelv/v3/blob/MIDS-scaling-up/v4/week03/hw/Pipeline.PNG).

### Hints
-- Using a USB device from Kubernetes requires a privileged security context. If you'd like your container to display your camera's images, you'll need to enable host networking and set the DISPLAY env variable.
- To make storing in Object Store easier, look at https://docs.aws.amazon.com/AmazonS3/latest/user-guide/create-configure-bucket.html
+- Depending on the nature of your video stream, the optimal video length and sampling frequency may vary. For example, if you are recording a hyperactive puppy, you may want a frame every few milliseconds; but if you record a sloth, this is a different situation:
-
-Review Lab 3!
-
+![flash](https://media.tenor.com/lWXg5ivpQUQAAAAC/zootopia-flash.gif)
+
### Grading/Submission
You are scored based on the following:
-- 60 points for a containerized end to end appliation
-- 10 points for using a user defined network in the cloud (automatic if using k8s on the cloud side)
-- 10 points for using Kubernetes on your Jetson
-- 10 points for explaining the MQTT topics and the QoS that you used.
-- 10 points for storing your faces in publicly reachable object storage
-- 10 bonus points for using Kubernetes instead of Docker on the cloud side.
+- 10 points for problem statement.
+- 15 points for curating and sharing videos.
+- 50 points for end-to-end ML pipeline.
+- 15 points for storing your objects in publicly reachable object storage.
+- 10 points for the write-up about the pipeline output.
+- 10 bonus points 🎉🎉 for trying other object detection models and comparing results.

+### Turn In
What to submit to ISVC:
-
-A link to the repository of your for this homework [private repo please] which should include your code, Dockerfiles, Docker command used, and Kubernetes YAML files. In addition, the answers to the 2 questions (eg. bullet point #4 above) should be included.
-
-A publicly accessble http link to the location of your faces in the object storage.
+- A link to the GitHub repository of yours for this homework [private repo please], which should include your code and configuration files.
+- A publicly accessible http link to the location of your auto-uploaded objects in the cloud storage and a separate link(s) for manually uploaded videos.
+- A document with the problem statement and data collection approach (can be part of the repo as a README file or a PDF) and a summary of your findings in this exercise (e.g. how well did the model do on your video? If it didn't do very well, can you hypothesize on the reasons for it: is it a data quality issue, such as poor lighting conditions/low resolution/object occlusion, or was it poor performance of the model itself?)

diff --git a/week04/README.md b/week04/README.md
index 2a86109..d794402 100755
--- a/week04/README.md
+++ b/week04/README.md
@@ -1,9 +1,12 @@
### Lecture 4: Deep Learning 101
In this lecture, we will talk about the basics of DL.
-The definition: how is it different from AI and ML? Artificial Neurons, Neural Layers. Feed Forward networks. Multi-layer Perceptron. Back propagation. Normalization.
Regularization. +The definition: how is it different from AI and ML? Artificial Neurons, Neural Layers. Feed Forward networks. Multi-layer Perceptron. Back propagation. Normalization. Regularization. For the practical task, we will study basics of NLP: text classification, tokenization, evaluation metrics, architecture of transformers. #### Reading: - A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay, Leslie N. Smith - Bag of Tricks for Image Classification with Convolutional Neural Networks, Tong He, Zhi Zhang, Hang Zhang, Zhongyue Zhang, Junyuan Xie, Mu Li - The [1Cycle Policy](https://sgugger.github.io/the-1cycle-policy.html), Sylvain Gugger +- [Using Transformers](https://huggingface.co/course/chapter2/1?fw=pt) - Pytorch [documentation](https://pytorch.org/docs/stable/index.html ) +- [Text classification task](https://paperswithcode.com/task/text-classification) +- [Jigsaw toxicity dataset (for the homework)](https://huggingface.co/datasets/jigsaw_toxicity_pred) diff --git a/week04/hw/README.md b/week04/hw/README.md index b846999..df54788 100644 --- a/week04/hw/README.md +++ b/week04/hw/README.md @@ -14,4 +14,9 @@ Try things like, - Add a small weight decay to your [Adam optimiser](https://pytorch.org/docs/stable/optim.html). After you are happy with the results, download the notebook as a html and submit it to ISVC, together with the highest AUC score you achieved. -You can get started [here](https://colab.research.google.com/drive/1kqbgfc1Lv3DXP6EdbpfQtMCsTHFR3wHA?usp=sharing). \ No newline at end of file +You can get started [here](https://colab.research.google.com/drive/1kqbgfc1Lv3DXP6EdbpfQtMCsTHFR3wHA?usp=sharing). + +## Turn-in +- Run the notebook, make sure that execution of all cells is complete, then save as HTML or PDF and upload to ISVC. We highly encourage you to create well structured and self-documented notebooks, with explanations in code comments and markdown. 
+- Complete the quiz https://huggingface.co/course/chapter2/8?fw=pt and download the results as well. + diff --git a/week05/Readme.md b/week05/Readme.md index 1812405..70afbf5 100644 --- a/week05/Readme.md +++ b/week05/Readme.md @@ -5,6 +5,6 @@ Graph Mode. Inference Runtimes. TensorFlow 1.0, TensorFlow 2.0, PyTorch, PyTorch Reading: -* PyTorch 60-minute blitz (training, glance through) -* PyTorch Lightning 1.0 (overview, glance through) -* Transfer Learning in NLP (presentation, glance through) +* PyTorch 60-minute blitz (training, glance through): https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html +* PyTorch Lightning 1.0 (overview, glance through): https://www.pytorchlightning.ai/ +* Transfer Learning in NLP (presentation, glance through): https://huggingface.co/course/chapter3/1?fw=pt diff --git a/week05/hw/Readme.md b/week05/hw/Readme.md index f6d1060..e7db6c4 100644 --- a/week05/hw/Readme.md +++ b/week05/hw/Readme.md @@ -35,11 +35,9 @@ The steps are roughly as follows: ### Please note * Please do not attempt to spend more than 3 days training your model on a single T4 GPU. If your estimate gives you a longer training time, pick a different approach. -* You might want to prototype your work using Jupyter and then submit it using [papermill](https://papermill.readthedocs.io/en/latest/usage-cli.html) ### Extra credit Create your own model architecture. You can draw your inspiration from the [PyTorch Resnet github](https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py), for instance. - ### To turn in Please turn in your training logs. They should obviously display that you have achieved the Top 1 accuracy. Also, please save / download the trained weights to your jetson device for evaluation later. diff --git a/week05/lab/Readme.md b/week05/lab/Readme.md index 99ef007..b98ea87 100644 --- a/week05/lab/Readme.md +++ b/week05/lab/Readme.md @@ -1,60 +1,25 @@ # Lab 5 - Deep Learning Frameworks. 
Training image classifiers
-This week we will practice writing basic training loops. If you have a Jetson NX, please use it; if not, the Nano is too underpowered for much of this, so please use Google Colab.
-
-The Jetson NX device is far less powerful than the Cloud GPUs. But, it has 8G of accessible GPU memory and is quite modern (compatible with the instruction set present in Volta GPUs), so it's likely faster than your CPU laptop. Also, the code you write will be compatible; you could just run it "as is" on more powerful machines.
-
-The Jetson Nano SoCs are based on the older, Maxwell, cores and don't have the Tensor cores. They also have less GPU memory - 2G or 4G. This forces us to reduce batch sizes when training and use smaller models or risk OOM errors.
-
-### 1a. Setup (Jetson NX)
-Pull the latest ml container for the jetson from NGC and start it, passing through the GPU, port 8888, the drive where you keep your data. Make sure that the version you pull matches your Jetpack. For instance, if you have 32.6 installed:
-```
-docker pull nvcr.io/nvidia/l4t-ml:r32.6.1-py3
-```
-This default image contains TF2, PyTorch, Jupyter, as well a few common data science libraries. If you want to see how this image was build, look [here](https://github.com/dusty-nv/jetson-containers).
-
-Start Jupyter Lab. Note that the default password is `nvidia`
-
-### 1b. Setup (Google Colab)
+This week we will practice writing basic training loops. We need GPUs for this and will use Google Colab. To enable the GPU in your notebook, select the menu option "Runtime -> Change runtime type" and pick "GPU" in the hardware accelerator dropdown so that your notebook can use a free GPU during processing.
+
+### 1. Setup
The easiest way to use Google Colab is to install the [Google Colab Chrome plugin](https://chrome.google.com/webstore/detail/open-in-colab/iogfkhleblhcpcekbiedikdehleodpjo?hl=en_).
Once installed, you can navigate to the desired Jupyter notebook in Github, and then click the Extensions icon to the right of the URL location in the browser and select 'Open in Colab'
-
-### 2a. Validate that a GPU is available (Jetson NX)
-Please recall or look up the PyTorch and TensorFlow 2 commands that tell you whether you have at least one GPU available. It's a good idea to check because some repos will silently fall through to the CPU and run slowly. If your GPU was not correctly passed through, go back and re-run the container.
-
-### 2b. Validate that a GPU is available (Google Colab)
+
+### 2. Validate that a GPU is available
Please recall or look up the PyTorch and TensorFlow 2 commands that tell you whether you have at least one GPU available. It's a good idea to check because some repos will silently fall through to the CPU and run slowly. If your GPU is not detected, click Runtime and Change Runtime type. How can you see the model of your GPU?
-
-### 3. Install Papermill (Jetson NX only)
-Please review this section as it will be useful during the homework assignment.
-
-[Papermill](https://papermill.readthedocs.io/en/latest/) comes in handy when you want to run your Jupyter notebooks programmatically, on the command line. This allows you to prototype locally and then submit your work with no changes to run on more powerful remote machines. Open up a terminal inside your Jupyter Lab and install papermill:
-```
-pip3 install papermill
-```
-### 4. Designate a parameters cell (Jetson NX only)
-Please review this section as it will be useful during the homework assignment.
-
-In your [lab template](https://github.com/MIDS-scaling-up/v3/blob/main/week05/lab/cifar_lab.ipynb) Jupyter notebook, tag one of the cells to be the parameter cell. At the moment, the ml image contains JupyterLab 2.2.9, so follow the instructions [here](https://papermill.readthedocs.io/en/latest/usage-parameterize.html#jupyterlab-2-0-2-2-x)
-
-### 5.
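One way to do that check is a small defensive helper; this is a sketch (the dictionary shape is our own convention), with the framework imports guarded so it also runs on machines where PyTorch or TensorFlow is not installed:

```python
def gpu_report():
    """Report GPU availability for PyTorch and TensorFlow 2, if installed."""
    report = {}
    try:
        import torch
        report["torch"] = torch.cuda.is_available()
        if report["torch"]:
            # answers "how can you see the model of your GPU?" for PyTorch
            report["torch_model"] = torch.cuda.get_device_name(0)
    except ImportError:
        report["torch"] = None  # PyTorch not installed
    try:
        import tensorflow as tf
        report["tf"] = len(tf.config.list_physical_devices("GPU")) > 0
    except ImportError:
        report["tf"] = None  # TensorFlow not installed
    return report
```

On a Colab GPU runtime you would expect `gpu_report()` to show `True` for whichever framework you use; `False` means the repo would silently fall through to the CPU.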
PyTorch - CIFAR10 classification + +### 3. PyTorch - CIFAR10 classification In this section, we will fill in the provided template and build an image classifier for the CIFAR10 dataset, which contains 60,000 32x32 color images that are sorted into 10 classes. Your goal is to quickly train a classifier from random weights -- and hey, you can do that on your NX! You don't need to train all the way to the end, just run it for a few epochs to get a feel for how well it converges. Make sure that your _validation_ loss declines. Make sure that you adjust the model architecture to the number of classes in CIFAR10 (10)! -### 6. PyTorch - CINIC10 dataset +### 4. PyTorch - CINIC10 dataset The [CINIC](https://github.com/BayesWatch/cinic-10) dataset bridges us to the goal of our homework - training on ImageNet. These images are still 32x32, but you'll have to download this dataset separately -- we recommend pulling it from the [Kaggle Dataset](https://www.kaggle.com/mengcius/cinic10) because it's a lot faster. You'll need to uncompress it and then modify your code to use it instead of CIFAR10. Hint: use `datasets.ImageFolder`. Same as before, you don't need to run it forever, just run for a few epochs. Google Colab notes: * You will need your Google Drive account so that you can store the results of your work as well as some persistent files (such as creds) -* make sure you set up your [Kaggle API](https://www.kaggle.com/docs/api). +* Make sure you set up your [Kaggle API](https://www.kaggle.com/docs/api). * Download kaggle.json and store it in your Google Drive someplace (e.g. under MyDrive/.kaggle) * When you run in your Colab notebook, copy this file to /root/.kaggle/kaggle.json and make sure the permissions on it are 600 * Now you should be able to use kaggle CLI commands (e.g. to download datasets). You can bake all these commands into your notebooks. -### 7. 
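As a hypothetical illustration of the `datasets.ImageFolder` hint above: once the Kaggle download is uncompressed, a loader can be built like this (the `root` path is an assumption about where you unpacked the data; the helper returns `None` where torch/torchvision are missing):

```python
def make_cinic_loader(root="cinic10/train", batch_size=64):
    """Build a DataLoader over an uncompressed CINIC-10 directory tree.

    ImageFolder expects one subdirectory per class under `root`.
    """
    try:
        import torch
        from torchvision import datasets, transforms
    except ImportError:
        return None  # torch/torchvision not installed in this environment
    tfm = transforms.Compose([
        transforms.ToTensor(),  # HWC uint8 image -> CHW float tensor in [0, 1]
    ])
    dataset = datasets.ImageFolder(root, transform=tfm)
    return torch.utils.data.DataLoader(dataset, batch_size=batch_size, shuffle=True)
```

Swapping this loader in place of the CIFAR10 one is the only change your training loop should need.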
Run with papermill (Jetson NX only) -In this step, we will test that our notebook can be invoked programmatically, e.g. -``` -papermill notebook.ipynb output.ipynb -p param1 -p param2 ... -``` -the `-p` directives will override the defaults provided in your jupyter cell - -### 8. PyTorch Lightning +### 5. PyTorch Lightning Now let's reformat this code for Pytorch Lightning! Fill in [this template](https://github.com/MIDS-scaling-up/v3/blob/main/week05/lab/cifar_lightning_lab.ipynb) with your code. diff --git a/week06/README.md b/week06/README.md index d3cb1d8..f60667f 100644 --- a/week06/README.md +++ b/week06/README.md @@ -3,6 +3,8 @@ GStreamer and model optizations ### GStreamer +GStreamer is an extremely powerful and versatile framework for creating streaming media applications. Many of the virtues of the GStreamer framework come from its modularity: GStreamer can seamlessly incorporate new plugin modules. But because modularity and power often come at a cost of greater complexity, writing new applications is not always easy. 
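GStreamer pipelines are chains of elements joined with `!`, which the `gst-launch-1.0` tool in the tutorial below consumes directly. As a tiny illustration, here is a hypothetical pure-Python helper that composes such a description string (it does not require GStreamer itself):

```python
def gst_pipeline(*elements):
    """Join GStreamer element descriptions into a gst-launch-1.0 pipeline string."""
    if not elements:
        raise ValueError("a pipeline needs at least one element")
    return " ! ".join(elements)

# A classic test pattern, runnable as: gst-launch-1.0 videotestsrc ! videoconvert ! autovideosink
desc = gst_pipeline("videotestsrc", "videoconvert", "autovideosink")
print(desc)  # -> videotestsrc ! videoconvert ! autovideosink
```

Each element in the chain is one plugin module, which is exactly the modularity the paragraph above describes.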
+ https://gstreamer.freedesktop.org/documentation/tutorials/basic/gstreamer-tools.html?gi-language=c @@ -17,8 +19,6 @@ https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_Overview.html#nv ### Quantization and model optizmation -https://github.com/dusty-nv/jetson-inference - https://www.tensorflow.org/model_optimization/guide/quantization/training https://www.tensorflow.org/lite/performance/model_optimization @@ -27,9 +27,7 @@ https://www.tensorflow.org/lite/performance/post_training_quantization https://www.tensorflow.org/lite/performance/post_training_float16_quant -https://www.tensorflow.org/lite/performance/post_training_quant - -https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html +https://www.tensorflow.org/lite/performance/post_training_quant https://pytorch.org/blog/introduction-to-quantization-on-pytorch/ diff --git a/week06/demo/gstreamer/README.md b/week06/demo/gstreamer/README.md deleted file mode 100644 index 45111ca..0000000 --- a/week06/demo/gstreamer/README.md +++ /dev/null @@ -1,277 +0,0 @@ -# Getting started Gstreamer on the Jetson NX - -The following is a simple introduction and demo to GStreamer, specfically on the Jetson Xavier NX platform. Additional documentation can be at https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide/accelerated_gstreamer.html - -The majority of this examples will be use the GStreamer tool gst-launch-1.0. - -Note, on Jetson devices the "automatic video sink", `autovideosink` is mapped to a sink that is an overlay. This means that the sink is not X-windows enabled and doesn't play well with VNC. As most of these examples are executed on a machine that uses X-windows and potentiall accessed via VNC, this sink will not be used. Instead, nv3dsink or nveglglessink will be used explictly. - - -## VNC -This demo may be used via VNC. 
If VNC is used, it is strongly recommended to us a reslution less than 4k as resolutions at 4k or higher can cause additional lag when VNC is used. For example, I typically set my resolution to `1600x900` via the command `xrandr` command and have no display pluged into the nx. - -From your a shell: -``` -export DISPLAY=:0 -xhost + -sudo xrandr --fb 1600x900 -``` - -## JP Version -The recored demo was done using Jetpack 4.4.1 and after the recording, Jetpack 4.5 was released. There appears to been a change between the versions to either nv3dsink and/or nvvidconv. Luckly the workaround is easy, replace: -``` -nvvidconv ! nv3dsink -``` -with -``` -nvvidconv ! 'video/x-raw(memory:NVMM)' ! nv3dsink -``` -This is backwards compatabile with 4.4.1 and the examples here have been updated to reflect this. - -The sink nveglglessink works as expected. - -## Part 1: -Our first pipeline will be a simple video test image. -`` -gst-launch-1.0 videotestsrc ! xvimagesink -`` - -This will display a classic "test pattern". The command is composed of two elements, the `videotestsrc` and a video sink, `xvimagesink`. - -Running `gst-inspect-1.0 videotestsrc` will provide some additional information on the src. One of the properies we can set is the pattern. -`` -gst-launch-1.0 videotestsrc pattern=snow ! xvimagesink and gst-launch-1.0 videotestsrc pattern=ball ! xvimagesink for example. -`` - -Nvidia also provides a couple of its own accellerated plugins: -- nv3dsink: a window-based rendering sink, and based on X11 -- nveglglessink: EGL/GLES video sink -- nvvidconv: a Filter/Converter/Video/Scaler, converts video from one colorspace to another & Resizes -- nvegltransform: tranforms to the EGLImage format. - -Inspecting nv3dsink, we can see that it requires an input of `video/x-raw(memory:NVMM)`. -This is not someting that videotestsrc outputs, so we'll need to use nvvidconv to convert. -Inspecting this, we can see it can take `video/x-raw` and output `video/x-raw(memory:NVMM)`. 
-``` -gst-launch-1.0 videotestsrc ! nvvidconv ! 'video/x-raw(memory:NVMM)' ! nv3dsink -e -``` - -To use nveglglessink, we'll need to use nvvidconv and nvegltransform, to go from NVMM to EGLImage. -``` -gst-launch-1.0 videotestsrc ! nvvidconv ! nvegltransform ! nveglglessink -e -``` - -Which sink to use? Will it just depends. xvimagesink is often easier to get going, but the nvidia ones provide additional acceration and perforance. - -Note, there are additional Nvidia sinks that may be used, but may not work over technology like VNC, e.g nvdrmvideosink. - -## Part 2: USB Camera -Now that we have some experience, we'll add a camera to the fix. We'll be using a USB camera, which leverages the v4l2src plugin. While not covered here, if you are using a Raspberry Pi Camera, you would need to use the Nvidia plugin `nvarguscamerasrc`. - -This example assumes your camera is using /dev/video0; you may need to adjust depending on your configuration. - -As before, we can take advantage of a varity of sinks. -``` -gst-launch-1.0 v4l2src device=/dev/video0 ! xvimagesink - -gst-launch-1.0 v4l2src device=/dev/video0 ! nvvidconv ! 'video/x-raw(memory:NVMM)' ! nv3dsink -e - -gst-launch-1.0 v4l2src device=/dev/video0 ! nvvidconv ! nvegltransform ! nveglglessink -e -``` - -Now that we have access to the cameara, we can explore what we can do. Now the the camera is the limit; if your camera doesn't support 60 FPS and 4K, there is no use asking for it. 
- -To list all of your cameras, you'll run the command: - -``` -gst-device-monitor-1.0 Video/Source -``` - -And you'll get output similar to this (note, we are only concerned with video/x-raw in this case): -``` - name : UVC Camera (046d:0825) - class : Video/Source - caps : video/x-raw, format=(string)YUY2, width=(int)1280, height=(int)960, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 15/2, 5/1 }; - video/x-raw, format=(string)YUY2, width=(int)1280, height=(int)720, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 15/2, 5/1 }; - video/x-raw, format=(string)YUY2, width=(int)1184, height=(int)656, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 10/1, 5/1 }; - video/x-raw, format=(string)YUY2, width=(int)960, height=(int)720, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 10/1, 5/1 }; - video/x-raw, format=(string)YUY2, width=(int)1024, height=(int)576, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 10/1, 5/1 }; - video/x-raw, format=(string)YUY2, width=(int)960, height=(int)544, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 15/1, 10/1, 5/1 }; - video/x-raw, format=(string)YUY2, width=(int)800, height=(int)600, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 20/1, 15/1, 10/1, 5/1 }; - video/x-raw, format=(string)YUY2, width=(int)864, height=(int)480, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 20/1, 15/1, 10/1, 5/1 }; - video/x-raw, format=(string)YUY2, width=(int)800, height=(int)448, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 20/1, 15/1, 10/1, 5/1 }; - video/x-raw, format=(string)YUY2, width=(int)752, height=(int)416, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 25/1, 20/1, 15/1, 10/1, 5/1 }; - video/x-raw, format=(string)YUY2, width=(int)640, height=(int)480, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 }; - video/x-raw, format=(string)YUY2, width=(int)640, height=(int)360, pixel-aspect-ratio=(fraction)1/1, 
framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 }; - video/x-raw, format=(string)YUY2, width=(int)544, height=(int)288, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 }; - video/x-raw, format=(string)YUY2, width=(int)432, height=(int)240, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 }; - video/x-raw, format=(string)YUY2, width=(int)352, height=(int)288, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 }; - video/x-raw, format=(string)YUY2, width=(int)320, height=(int)240, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 }; - video/x-raw, format=(string)YUY2, width=(int)320, height=(int)176, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 }; - video/x-raw, format=(string)YUY2, width=(int)176, height=(int)144, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 }; - video/x-raw, format=(string)YUY2, width=(int)160, height=(int)120, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 }; - -``` - -You can see that for this camera, it's format is YUY2, and that our available dimensions and framerates are related. Let's start by asking for 30 FPS. Note, you'll need to use the` X/1` for framerates. -``` -gst-launch-1.0 v4l2src device=/dev/video0 ! video/x-raw,framerate=30/1 ! xvimagesink -``` -Notice that the size of the window as changed. Now what if we want 30 FPS at 1280 x 720? Let's find out. -``` -gst-launch-1.0 v4l2src device=/dev/video0 ! video/x-raw,framerate=30/1,width=1280,height=960 ! xvimagesink -``` -and for me, it fails. If we look above, should be clear why; my camera doesn't support that. - -Let's go the other way and ask for 160x120. -``` -gst-launch-1.0 v4l2src device=/dev/video0 ! video/x-raw,framerate=30/1,width=160,height=120 ! 
xvimagesink -``` - -And it works as expected. Feel free to explore what your camera can do! - -Now what can we do with the video? Say we want to see the image in grayscale. -``` -gst-launch-1.0 v4l2src device=/dev/video0 ! video/x-raw,framerate=30/1 ! videoconvert ! video/x-raw,format=GRAY8 ! videoconvert ! xvimagesink -``` -Why is the extra videoconvert needed? Try: -``` -gst-launch-1.0 v4l2src device=/dev/video0 ! video/x-raw,framerate=30/1 ! videoconvert ! video/x-raw,format=GRAY8 ! xvimagesink -``` -It fails, as we need to make sure the video is in a format that xvimagesink can understand, e.g. BGRx. - -We can do the same with the Nvidia plugins: -``` -gst-launch-1.0 v4l2src device=/dev/video0 ! video/x-raw,framerate=30/1 ! nvvidconv ! 'video/x-raw(memory:NVMM)' ! nvvidconv ! 'video/x-raw,format=GRAY8' ! nvvidconv ! 'video/x-raw(memory:NVMM)' ! nv3dsink -e -``` - -We can also do fun things like flip the image: -``` -gst-launch-1.0 v4l2src device=/dev/video0 ! video/x-raw,framerate=30/1,width=160,height=120 ! nvvidconv flip-method=rotate-180 ! 'video/x-raw(memory:NVMM)' ! nv3dsink -e -``` - -## Part 3: Fun and games - -Now let's play with some "special effects". -``` -gst-launch-1.0 v4l2src device=/dev/video0 ! video/x-raw, framerate=30/1 ! videoconvert ! warptv ! videoconvert ! xvimagesink - -gst-launch-1.0 videotestsrc ! agingtv scratch-lines=15 ! videoconvert ! xvimagesink - -``` -Now add some text to the video: -``` -gst-launch-1.0 v4l2src device=/dev/video0 ! video/x-raw, framerate=30/1 ! videoconvert ! textoverlay text="/device/video0" valignment=bottom halignment=left font-desc="Sans, 40" ! xvimagesink -``` -Now what about having an image routed to more than one window? This uses tee and queue. -``` -gst-launch-1.0 v4l2src device=/dev/video0 ! video/x-raw, framerate=30/1,width=160,height=120 ! queue ! tee name=t t. ! queue ! xvimagesink sync=false t. ! queue ! videoflip method=horizontal-flip ! 
xvimagesink sync=false -e -``` - -We can also overlay images on top of each other. We'll take advantage of `nvcompositor`, which is hardware accelerated. - -``` -gst-launch-1.0 nvcompositor name=mix sink_0::xpos=0 sink_0::ypos=0 sink_0::zorder=10 sink_1::xpos=0 sink_1::ypos=0 ! nvegltransform ! nveglglessink videotestsrc ! nvvidconv ! mix.sink_0 v4l2src device=/dev/video0 ! nvvidconv ! 'video/x-raw(memory:NVMM)' ! mix.sink_1 -``` - -## Part 4: Encoding and Decoding -The NX can encode and decode video in hardware. -``` -gst-launch-1.0 videotestsrc ! 'video/x-raw, format=(string)I420, width=(int)640, -height=(int)480' ! omxh264enc ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! omxh264dec ! nveglglessink -e -``` - -Take a look at jtop and you can see that the hardware accelerators are being used. -``` -jtop -``` -Let's encode to a file. - -``` -gst-launch-1.0 videotestsrc ! \ - 'video/x-raw, format=(string)I420, width=(int)640, \ - height=(int)480' ! omxh264enc ! \ - 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! \ - qtmux ! filesink location=test.mp4 -e -``` -Or with your video... -``` -gst-launch-1.0 v4l2src device=/dev/video0 ! video/x-raw,framerate=30/1 ! nvvidconv ! omxh264enc ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! qtmux ! filesink location=test.mp4 -e -``` - -And playing it back is simple: -``` -gst-launch-1.0 filesrc location=test.mp4 ! qtdemux ! queue ! h264parse ! nvv4l2decoder ! nv3dsink -e -``` - -## Part 5: Streaming - -This is a simple streaming example, with both sides running on the NX. This will require two shell windows. - -This works with the latest Jetpack. - -In the first window, run the following: -``` -gst-launch-1.0 v4l2src device=/dev/video0 ! video/x-raw,framerate=30/1,width=640,height=480 ! nvvidconv ! 'video/x-raw(memory:NVMM), format=NV12' ! nvv4l2h264enc insert-sps-pps=true ! h264parse ! rtph264pay pt=96 ! 
udpsink host=127.0.0.1 port=8001 sync=false -e -``` -This starts the "server" sending packets (UDP) to the IP address 127.0.0.1 on port 8001. The server broadcasts the stream using RTP, with the video encoded as H.264. - -In the second window, run the following: -``` - gst-launch-1.0 udpsrc address=127.0.0.1 port=8001 caps='application/x-rtp, encoding-name=(string)H264, payload=(int)96' ! rtph264depay ! queue ! h264parse ! nvv4l2decoder ! nv3dsink sync=false -e -``` -This listens for the packets, decodes the RTP stream, and displays it on the screen. - - - -## Part 6: Python and OpenCV -We can leverage GStreamer from within Python and OpenCV. - -``` -import numpy as np -import cv2 - -# use gstreamer for video directly; set the fps -camSet='v4l2src device=/dev/video0 ! video/x-raw,framerate=30/1 ! videoconvert ! video/x-raw, format=BGR ! appsink' -cap = cv2.VideoCapture(camSet) - -#cap = cv2.VideoCapture(0) - -while(True): - # Capture frame-by-frame - ret, frame = cap.read() - - # Display the resulting frame - cv2.imshow('frame',frame) - if cv2.waitKey(1) & 0xFF == ord('q'): - break - -# When everything is done, release the capture -cap.release() -cv2.destroyAllWindows() -``` -You can also leverage features of GStreamer; for example, we can add warptv. -``` -import numpy as np -import cv2 - -# use gstreamer for video directly; set the fps -camSet='v4l2src device=/dev/video0 ! video/x-raw, framerate=30/1 ! videoconvert ! warptv ! videoconvert ! 
appsink' -cap= cv2.VideoCapture(camSet) - -#cap = cv2.VideoCapture(0) - -while(True): - # Capture frame-by-frame - ret, frame = cap.read() - - # Display the resulting frame - cv2.imshow('frame',frame) - if cv2.waitKey(1) & 0xFF == ord('q'): - break - -# When everything done, release the capture -cap.release() -cv2.destroyAllWindows() -``` - diff --git a/week06/demo/quantization/tf-trt/Dockerfile b/week06/demo/quantization/tf-trt/Dockerfile deleted file mode 100644 index 45bff72..0000000 --- a/week06/demo/quantization/tf-trt/Dockerfile +++ /dev/null @@ -1,15 +0,0 @@ -FROM nvcr.io/nvidia/l4t-tensorflow:r32.6.1-tf2.5-py3 -RUN apt-get update && apt-get install python3-dev git sudo unzip -y -RUN pip3 install -U pip -RUN pip3 install cffi setuptools pillow matplotlib notebook jetson-stats -WORKDIR /app/tf-trt -COPY *.sh ./ -RUN sh install_protobuf-3.13.0.sh -RUN cp -R /usr/local/lib/python3.6/dist-packages/protobuf-3.13.0-py3.6-linux-aarch64.egg/google/protobuf /usr/local/lib/python3.6/dist-packages/google/ - -COPY tf-trt.ipynb tf-trt.ipynb - -# To Run : sudo docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix nvcr.io/nvidia/l4t-base:r32.4.4 - -CMD ["jupyter", "notebook", "--port=8888", "--no-browser", "--ip=0.0.0.0", "--allow-root"] - diff --git a/week06/demo/quantization/tf-trt/Dockerfile.jp45 b/week06/demo/quantization/tf-trt/Dockerfile.jp45 deleted file mode 100644 index 958cc7b..0000000 --- a/week06/demo/quantization/tf-trt/Dockerfile.jp45 +++ /dev/null @@ -1,14 +0,0 @@ -FROM nvcr.io/nvidia/l4t-tensorflow:r32.5.0-tf2.3-py3 -RUN apt-get update && apt-get install python3-dev git sudo unzip -y -RUN pip3 install -U pip -RUN pip3 install cffi setuptools pillow matplotlib notebook jetson-stats -WORKDIR /app/tf-trt -COPY *.sh ./ -RUN sh install_protobuf-3.13.0.sh -RUN cp -R /usr/local/lib/python3.6/dist-packages/protobuf-3.13.0-py3.6-linux-aarch64.egg/google/protobuf /usr/local/lib/python3.6/dist-packages/google/ - -COPY 
tf-trt.ipynb tf-trt.ipynb - -# To Run : sudo docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix nvcr.io/nvidia/l4t-base:r32.4.4 - -CMD ["jupyter", "notebook", "--port=8888", "--no-browser", "--ip=0.0.0.0", "--allow-root"] diff --git a/week06/demo/quantization/tf-trt/README.md b/week06/demo/quantization/tf-trt/README.md deleted file mode 100644 index 32b3cba..0000000 --- a/week06/demo/quantization/tf-trt/README.md +++ /dev/null @@ -1,37 +0,0 @@ -# TensorFlow with TensorRT (TF-TRT) - -This is a very simple image classification example based on https://github.com/tensorflow/tensorrt/tree/master/tftrt/examples/image_classification updated to run on the Jetson Xavier NX. You'll learn how to use TensorFlow 2.x to convert a Keras model to three TF-TRT models: FP32, FP16, and INT8. A simple set of test images will be used to both validate and benchmark the native model and the three TF-TRT ones. - -## Jetpack Version -If using Jetpack 4.5, you'll need to build with the file Dockerfile.jp45. Note, at this time, JP4.5 is giving about half the performance of the JP4.4.1 version. - -## Running -This demo is made available via a Dockerfile. The image built from this demo leverages Nvidia's TensorFlow 2.x build (see https://docs.nvidia.com/deeplearning/frameworks/install-tf-jetson-platform/index.html ) and requires an update to Protobuf. There appears to be an issue with the Python implementation of protobuf (see https://jkjung-avt.github.io/tf-trt-revisited). The current workaround is to build and install a C++ based implementation. The script install_protobuf-3.13.0.sh will download, build, and install protobuf 3.13.0. - -Assuming you've checked out this repository on your NX, head to the subdirectory `quantization/tf-trt`. From this directory, run the command `docker build -t tf-trt-demo .`. This will take a bit of time to build. 
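The C++ protobuf workaround described above can be sanity-checked once the image is built. This is a small illustrative sketch, not part of the repository; it assumes only the standard protobuf package layout, where `api_implementation.Type()` reports the active backend:

```python
# Quick check of which protobuf backend is active; after running
# install_protobuf-3.13.0.sh inside the container, this should report 'cpp'.
try:
    from google.protobuf.internal import api_implementation
    backend = api_implementation.Type()  # 'cpp', 'python', or (newer releases) 'upb'
except ImportError:  # protobuf not installed on this machine
    backend = None
print(backend)
```

If this still prints 'python', the C++ build did not take effect and the TF-TRT conversion slowdown described above may persist.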
 - -Once the image is built, you can run the command `docker run -it --rm --net=host tf-trt-demo`. Once the container is running, you'll see output similar to: -``` -[I 21:26:42.668 NotebookApp] Writing notebook server cookie secret to /root/.local/share/jupyter/runtime/notebook_cookie_secret -[I 21:26:43.631 NotebookApp] Serving notebooks from local directory: /app/tf-trt -[I 21:26:43.631 NotebookApp] Jupyter Notebook 6.1.6 is running at: -[I 21:26:43.631 NotebookApp] http://nx:8888/?token=af4be11ce363992a3815f1893de5b4f219940a7fb364040a -[I 21:26:43.631 NotebookApp] or http://127.0.0.1:8888/?token=af4be11ce363992a3815f1893de5b4f219940a7fb364040a -[I 21:26:43.631 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation). -[C 21:26:43.647 NotebookApp] - - To access the notebook, open this file in a browser: - file:///root/.local/share/jupyter/runtime/nbserver-1-open.html - Or copy and paste one of these URLs: - http://nx:8888/?token=af4be11ce363992a3815f1893de5b4f219940a7fb364040a - or http://127.0.0.1:8888/?token=af4be11ce363992a3815f1893de5b4f219940a7fb364040a -``` -Navigate to the appropriate URL and open the file `tf-trt.ipynb`. - - -Once the notebook is open, you may run each piece. Note, the flush.sh script is available to clear cached memory if needed. In addition, the notebook restarts the kernel at a number of points to free up memory. - -This image may be pulled (instead of built) from the Docker Hub registry `rdejana/tf-trt-demo`. 
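The notebook's benchmark cells report throughput as total images divided by total wall-clock time. A minimal pure-Python sketch of that calculation (the `throughput` helper is illustrative, not part of the notebook):

```python
def throughput(elapsed_times, batch_size):
    """Images per second, given per-batch wall-clock times (as in the notebook)."""
    return len(elapsed_times) * batch_size / sum(elapsed_times)

# Four timed runs of a batch of 8, each taking 0.25 s: 32 images in 1 s.
print(throughput([0.25, 0.25, 0.25, 0.25], 8))  # 32.0
```

This is the figure the notebook prints after the warm-up runs, so comparing it across the native, FP32, FP16, and INT8 models shows the speedup from each precision.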
 - - - diff --git a/week06/demo/quantization/tf-trt/flush.sh b/week06/demo/quantization/tf-trt/flush.sh deleted file mode 100755 index fd76c07..0000000 --- a/week06/demo/quantization/tf-trt/flush.sh +++ /dev/null @@ -1,5 +0,0 @@ -#!/bin/sh - -sync; echo 1 > /proc/sys/vm/drop_caches -sync; echo 2 > /proc/sys/vm/drop_caches -sync; echo 3 > /proc/sys/vm/drop_caches diff --git a/week06/demo/quantization/tf-trt/install_protobuf-3.13.0.sh b/week06/demo/quantization/tf-trt/install_protobuf-3.13.0.sh deleted file mode 100644 index 0f19fe9..0000000 --- a/week06/demo/quantization/tf-trt/install_protobuf-3.13.0.sh +++ /dev/null @@ -1,46 +0,0 @@ -#!/bin/bash - -set -e - -folder=${HOME}/src -mkdir -p $folder - -echo "** Install requirements" -sudo apt-get install -y autoconf libtool - -echo "** Download protobuf-3.13.0 sources" -cd $folder -if [ ! -f protobuf-python-3.13.0.zip ]; then - wget https://github.com/protocolbuffers/protobuf/releases/download/v3.13.0/protobuf-python-3.13.0.zip -fi -if [ ! -f protoc-3.13.0-linux-aarch_64.zip ]; then - wget https://github.com/protocolbuffers/protobuf/releases/download/v3.13.0/protoc-3.13.0-linux-aarch_64.zip -fi - -echo "** Install protoc" -unzip protobuf-python-3.13.0.zip -unzip protoc-3.13.0-linux-aarch_64.zip -d protoc-3.13.0 -sudo cp protoc-3.13.0/bin/protoc /usr/local/bin/protoc - -echo "** Build and install protobuf-3.13.0 libraries" -export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp -cd protobuf-3.13.0/ -./autogen.sh -./configure --prefix=/usr/local -make -j$(nproc) -make check -sudo make install -sudo ldconfig - -echo "** Update python3 protobuf module" -# remove previous installation of python3 protobuf module -sudo pip3 uninstall -y protobuf -sudo pip3 install Cython -cd python/ -# force compilation with c++11 standard -sed -i '205s/if v:/if True:/' setup.py -python3 setup.py build --cpp_implementation -python3 setup.py test --cpp_implementation -sudo python3 setup.py install --cpp_implementation - -echo "** Build 
protobuf-3.13.0 successfully" diff --git a/week06/demo/quantization/tf-trt/tf-trt.ipynb b/week06/demo/quantization/tf-trt/tf-trt.ipynb deleted file mode 100644 index fb5d69d..0000000 --- a/week06/demo/quantization/tf-trt/tf-trt.ipynb +++ /dev/null @@ -1,1027 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "view-in-github" - }, - "source": [ - "\"Open" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "dR1W9kv7IPhE" - }, - "outputs": [], - "source": [ - "# Copyright 2019 NVIDIA Corporation. All Rights Reserved.\n", - "#\n", - "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", - "# you may not use this file except in compliance with the License.\n", - "# You may obtain a copy of the License at\n", - "#\n", - "# http://www.apache.org/licenses/LICENSE-2.0\n", - "#\n", - "# Unless required by applicable law or agreed to in writing, software\n", - "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", - "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", - "# See the License for the specific language governing permissions and\n", - "# limitations under the License.\n", - "# ==============================================================================" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "Yb3TdMZAkVNq" - }, - "source": [ - "# TF-TRT Inference from Keras Model with TensorFlow 2.0\n", - "\n", - "\n", - "## Introduction\n", - "\n", - "This example demonstrates the optimization of a Keras model using Nvidia's TensorRT library via TensorFlow integration with TensorRT (TF-TRT). TF-TRT optimizes and executes compatible subgraphs, allowing TensorFlow to execute the remaining graph. 
While you can still use TensorFlow's wide and flexible feature set, TensorRT will parse the model and apply optimizations to the portions of the graph wherever possible.\n", - "\n", - "In this notebook, we demonstrate the process of creating a TF-TRT optimized model from a ResNet-50 Keras saved model and demonstrate the performance of models optimized using FP32, FP16, and INT8 precision.\n", - "\n", - "This demo is designed to work with the Xavier NX.\n", - "\n", - "Note, during the running of this notebook, the kernel will be restarted several times to free up memory. This is done via the command:\n", - "```\n", - "os.kill(os.getpid(), 9)\n", - "```" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import tensorflow as tf\n", - "from tensorflow.python.client import device_lib\n", - "\n", - "def check_tensor_core_gpu_present():\n", - " local_device_protos = device_lib.list_local_devices()\n", - " for line in local_device_protos:\n", - " if \"compute capability\" in str(line):\n", - " compute_capability = float(line.physical_device_desc.split(\"compute capability: \")[-1])\n", - " if compute_capability>=7.0:\n", - " return True\n", - "\n", - " \n", - "print(\"Tensorflow version: \", tf.version.VERSION)\n", - "print(\"Tensor Core GPU Present:\", check_tensor_core_gpu_present())\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "v-R2iN4akVOi" - }, - "source": [ - "## Data\n", - "\n", - "We download several random images for testing from the Internet." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "tVJ2-8rokVOl", - "scrolled": true - }, - "outputs": [], - "source": [ - "!mkdir ./data\n", - "!wget -O ./data/img0.JPG \"https://d17fnq9dkz9hgj.cloudfront.net/breed-uploads/2018/08/siberian-husky-detail.jpg?bust=1535566590&width=630\"\n", - "!wget -O ./data/img1.JPG \"https://www.hakaimagazine.com/wp-content/uploads/header-gulf-birds.jpg\"\n", - "!wget -O ./data/img2.JPG \"https://www.artis.nl/media/filer_public_thumbnails/filer_public/00/f1/00f1b6db-fbed-4fef-9ab0-84e944ff11f8/chimpansee_amber_r_1920x1080.jpg__1920x1080_q85_subject_location-923%2C365_subsampling-2.jpg\"\n", - "!wget -O ./data/img3.JPG \"https://www.familyhandyman.com/wp-content/uploads/2018/09/How-to-Avoid-Snakes-Slithering-Up-Your-Toilet-shutterstock_780480850.jpg\"" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "nWYufTjPCMgW" - }, - "source": [ - "### Setting up" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "Yyzwxjlm37jx" - }, - "outputs": [], - "source": [ - "from __future__ import absolute_import, division, print_function, unicode_literals\n", - "import os\n", - "import time\n", - "\n", - "import numpy as np\n", - "import matplotlib.pyplot as plt\n", - "\n", - "import tensorflow as tf\n", - "from tensorflow import keras\n", - "from tensorflow.python.compiler.tensorrt import trt_convert as trt\n", - "from tensorflow.python.saved_model import tag_constants\n", - "from tensorflow.keras.applications.resnet50 import ResNet50\n", - "from tensorflow.keras.preprocessing import image\n", - "from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 269 - }, - "colab_type": "code", - 
"id": "F_9n-AR1kVOv", - "outputId": "e0ead6dc-e761-404e-a030-f6d3057a57da" - }, - "outputs": [], - "source": [ - "from tensorflow.keras.preprocessing import image\n", - "\n", - "fig, axes = plt.subplots(nrows=2, ncols=2)\n", - "\n", - "for i in range(4):\n", - " img_path = './data/img%d.JPG'%i\n", - " img = image.load_img(img_path, target_size=(224, 224))\n", - " plt.subplot(2,2,i+1)\n", - " plt.imshow(img);\n", - " plt.axis('off');" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "xeV4r2YTkVO1" - }, - "source": [ - "## Model\n", - "\n", - "We next download and test a ResNet-50 pre-trained model from the Keras model zoo and demonstrate that it is able to classify our images." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 73 - }, - "colab_type": "code", - "id": "WwRBOikEkVO3", - "outputId": "2d63bc46-8bac-492f-b519-9ae5f19176bc" - }, - "outputs": [], - "source": [ - "model = ResNet50(weights='imagenet')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 410 - }, - "colab_type": "code", - "id": "lFKQPoLO_ikd", - "outputId": "c0b93de8-c94b-4977-992e-c780e12a3d52" - }, - "outputs": [], - "source": [ - "for i in range(4):\n", - " img_path = './data/img%d.JPG'%i\n", - " img = image.load_img(img_path, target_size=(224, 224))\n", - " x = image.img_to_array(img)\n", - " x = np.expand_dims(x, axis=0)\n", - " x = preprocess_input(x)\n", - "\n", - " preds = model.predict(x)\n", - " # decode the results into a list of tuples (class, description, probability)\n", - " # (one such list for each sample in the batch)\n", - " print('{} - Predicted: {}'.format(img_path, decode_predictions(preds, top=3)[0]))\n", - "\n", - " plt.subplot(2,2,i+1)\n", - " plt.imshow(img);\n", - " plt.axis('off');\n", - " plt.title(decode_predictions(preds, 
top=3)[0][0][1])\n", - " " - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "XrL3FEcdkVPA" - }, - "source": [ - "TF-TRT takes input as a TensorFlow saved model, therefore, we re-export the Keras model as a TF saved model." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 110 - }, - "colab_type": "code", - "id": "WxlUF3rlkVPH", - "outputId": "9f3864e7-f211-4c06-d2d2-585c1a477e34" - }, - "outputs": [], - "source": [ - "# Save the entire model as a SavedModel.\n", - "model.save('resnet50_saved_model') " - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "qBQwBvlNm-J8" - }, - "source": [ - "### Inference with native TF2.0 saved model" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "8zLN0GMCkVPe" - }, - "outputs": [], - "source": [ - "model = tf.keras.models.load_model('resnet50_saved_model')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 219 - }, - "colab_type": "code", - "id": "Fbj-UEOxkVPs", - "outputId": "3a2b34f9-8034-48cb-b3fe-477f09966025" - }, - "outputs": [], - "source": [ - "img_path = './data/img0.JPG' # Siberian_husky\n", - "img = image.load_img(img_path, target_size=(224, 224))\n", - "x = image.img_to_array(img)\n", - "x = np.expand_dims(x, axis=0)\n", - "x = preprocess_input(x)\n", - "\n", - "preds = model.predict(x)\n", - "# decode the results into a list of tuples (class, description, probability)\n", - "# (one such list for each sample in the batch)\n", - "print('{} - Predicted: {}'.format(img_path, decode_predictions(preds, top=3)[0]))\n", - "plt.subplot(2,2,1)\n", - "plt.imshow(img);\n", - "plt.axis('off');\n", - "plt.title(decode_predictions(preds, top=3)[0][0][1])" - ] - }, - { - "cell_type": "code", - 
"execution_count": null, - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 35 - }, - "colab_type": "code", - "id": "CGc-dC6DvwRP", - "outputId": "e0a22e05-f4fe-47b6-93e8-2b806bf7098a" - }, - "outputs": [], - "source": [ - "batch_size = 8\n", - "batched_input = np.zeros((batch_size, 224, 224, 3), dtype=np.float32)\n", - "\n", - "for i in range(batch_size):\n", - " img_path = './data/img%d.JPG' % (i % 4)\n", - " img = image.load_img(img_path, target_size=(224, 224))\n", - " x = image.img_to_array(img)\n", - " x = np.expand_dims(x, axis=0)\n", - " x = preprocess_input(x)\n", - " batched_input[i, :] = x\n", - "batched_input = tf.constant(batched_input)\n", - "print('batched_input shape: ', batched_input.shape)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "rFBV6hQR7N3z" - }, - "outputs": [], - "source": [ - "# Benchmarking throughput\n", - "N_warmup_run = 50\n", - "N_run = 1000\n", - "elapsed_time = []\n", - "\n", - "for i in range(N_warmup_run):\n", - " preds = model.predict(batched_input)\n", - "\n", - "for i in range(N_run):\n", - " start_time = time.time()\n", - " preds = model.predict(batched_input)\n", - " end_time = time.time()\n", - " elapsed_time = np.append(elapsed_time, end_time - start_time)\n", - " if i % 50 == 0:\n", - " print('Step {}: {:4.1f}ms'.format(i, (elapsed_time[-50:].mean()) * 1000))\n", - "\n", - "print('Throughput: {:.0f} images/s'.format(N_run * batch_size / elapsed_time.sum()))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's reset to clean up our memory" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "os.kill(os.getpid(), 9)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "vC_RN0BAkVPy" - }, - "source": [ - "### TF-TRT FP32 model\n", - "\n", - "We first convert the TF native 
FP32 model to a TF-TRT FP32 model." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from __future__ import absolute_import, division, print_function, unicode_literals\n", - "import os\n", - "import time\n", - "\n", - "import numpy as np\n", - "import matplotlib.pyplot as plt\n", - "\n", - "import tensorflow as tf\n", - "from tensorflow import keras\n", - "from tensorflow.python.compiler.tensorrt import trt_convert as trt\n", - "from tensorflow.python.saved_model import tag_constants\n", - "from tensorflow.keras.applications.resnet50 import ResNet50\n", - "from tensorflow.keras.preprocessing import image\n", - "from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions\n", - "\n", - "\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 126 - }, - "colab_type": "code", - "id": "0eLImSJ-kVPz", - "outputId": "e2c353c7-8e4b-49aa-ab97-f4d82797d4d8" - }, - "outputs": [], - "source": [ - "print('Converting to TF-TRT FP32...')\n", - "\n", - "# max_workspace_size_bytes sets how much GPU memory will be available at runtime\n", - "# what happens if you make max value bigger (say 8000000000) or smaller (say 1000000000)?\n", - "max = 3000000000\n", - "conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(precision_mode=trt.TrtPrecisionMode.FP32,\n", - " max_workspace_size_bytes=max)\n", - "\n", - "converter = trt.TrtGraphConverterV2(input_saved_model_dir='resnet50_saved_model',\n", - " conversion_params=conversion_params)\n", - "converter.convert()\n", - "converter.save(output_saved_model_dir='resnet50_saved_model_TFTRT_FP32')\n", - "print('Done Converting to TF-TRT FP32')" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "Vd2DoGUp8ivj" - }, - "source": [ - "Next, we load and test the TF-TRT FP32 model." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "rf97K_rxvwRm" - }, - "outputs": [], - "source": [ - "batch_size = 8\n", - "batched_input = np.zeros((batch_size, 224, 224, 3), dtype=np.float32)\n", - "\n", - "for i in range(batch_size):\n", - " img_path = './data/img%d.JPG' % (i % 4)\n", - " img = image.load_img(img_path, target_size=(224, 224))\n", - " x = image.img_to_array(img)\n", - " x = np.expand_dims(x, axis=0)\n", - " x = preprocess_input(x)\n", - " batched_input[i, :] = x\n", - "batched_input = tf.constant(batched_input)\n", - "print('batched_input shape: ', batched_input.shape)\n", - "\n", - "def predict_tftrt(saved_model_loaded):\n", - " \"\"\"Runs prediction on a single image and shows the result.\n", - " input_saved_model (string): Name of the input model stored in the current dir\n", - " \"\"\"\n", - " \n", - " signature_keys = list(saved_model_loaded.signatures.keys())\n", - " print(signature_keys)\n", - "\n", - " infer = saved_model_loaded.signatures['serving_default']\n", - " print(infer.structured_outputs)\n", - "\n", - "\n", - " for i in range(4):\n", - " img_path = './data/img%d.JPG'%i\n", - " #img_path = './data/img0.JPG' # Siberian_husky\n", - " img = image.load_img(img_path, target_size=(224, 224))\n", - " x = image.img_to_array(img)\n", - " x = np.expand_dims(x, axis=0)\n", - " x = preprocess_input(x)\n", - " x = tf.constant(x)\n", - " \n", - " labeling = infer(x)\n", - " preds = labeling['predictions'].numpy()\n", - " print('{} - Predicted: {}'.format(img_path, decode_predictions(preds, top=3)[0]))\n", - " plt.subplot(2,2,i+1)\n", - " plt.imshow(img);\n", - " plt.axis('off');\n", - " plt.title(decode_predictions(preds, top=3)[0][0][1])\n", - " \n", - "def benchmark_tftrt(saved_model_loaded):\n", - " # saved_model_loaded = tf.saved_model.load(input_saved_model, tags=[tag_constants.SERVING])\n", - " infer = saved_model_loaded.signatures['serving_default']\n", - 
"\n", - " N_warmup_run = 50\n", - " N_run = 1000\n", - " elapsed_time = []\n", - "\n", - " for i in range(N_warmup_run):\n", - " labeling = infer(batched_input)\n", - "\n", - " for i in range(N_run):\n", - " start_time = time.time()\n", - " labeling = infer(batched_input)\n", - " end_time = time.time()\n", - " elapsed_time = np.append(elapsed_time, end_time - start_time)\n", - " if i % 50 == 0:\n", - " print('Step {}: {:4.1f}ms'.format(i, (elapsed_time[-50:].mean()) * 1000))\n", - "\n", - " print('Throughput: {:.0f} images/s'.format(N_run * batch_size / elapsed_time.sum()))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "## load the model\n", - "\n", - "saved_model_loaded = tf.saved_model.load('resnet50_saved_model_TFTRT_FP32', tags=[tag_constants.SERVING])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 238 - }, - "colab_type": "code", - "id": "pRK0pRE-snvb", - "outputId": "1f7ab6c1-dbfa-4e3e-a21d-df9975c70455" - }, - "outputs": [], - "source": [ - "predict_tftrt(saved_model_loaded)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "ai6bxNcNszHc" - }, - "outputs": [], - "source": [ - "benchmark_tftrt(saved_model_loaded)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "os.kill(os.getpid(), 9)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "G2F8t6cPkVQS" - }, - "source": [ - "### TF-TRT FP16 model\n", - "We next convert the native TF FP32 saved model to TF-TRT FP16 model." 
 - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from __future__ import absolute_import, division, print_function, unicode_literals\n", - "import os\n", - "import time\n", - "\n", - "import numpy as np\n", - "import matplotlib.pyplot as plt\n", - "\n", - "import tensorflow as tf\n", - "from tensorflow import keras\n", - "from tensorflow.python.compiler.tensorrt import trt_convert as trt\n", - "from tensorflow.python.saved_model import tag_constants\n", - "from tensorflow.keras.applications.resnet50 import ResNet50\n", - "from tensorflow.keras.preprocessing import image\n", - "from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions\n", - "\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 126 - }, - "colab_type": "code", - "id": "0ia_AlSDkVQT", - "outputId": "d29eb6de-101b-4b9a-8ebf-4880a2e469bf" - }, - "outputs": [], - "source": [ - "print('Converting to TF-TRT FP16...')\n", - "\n", - "# max_workspace_size_bytes sets how much GPU memory will be available at runtime\n", - "# what happens if you make max value bigger (say 8000000000) or smaller (say 1000000000)?\n", - "max = 3000000000\n", - "conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(\n", - " precision_mode=trt.TrtPrecisionMode.FP16,\n", - " max_workspace_size_bytes=max)\n", - "converter = trt.TrtGraphConverterV2(\n", - " input_saved_model_dir='resnet50_saved_model', conversion_params=conversion_params)\n", - "converter.convert()\n", - "converter.save(output_saved_model_dir='resnet50_saved_model_TFTRT_FP16')\n", - "print('Done Converting to TF-TRT FP16')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "batch_size = 8\n", - "batched_input = np.zeros((batch_size, 224, 224, 3), dtype=np.float32)\n", - "\n", - "for i in range(batch_size):\n", - 
" img_path = './data/img%d.JPG' % (i % 4)\n", - " img = image.load_img(img_path, target_size=(224, 224))\n", - " x = image.img_to_array(img)\n", - " x = np.expand_dims(x, axis=0)\n", - " x = preprocess_input(x)\n", - " batched_input[i, :] = x\n", - "batched_input = tf.constant(batched_input)\n", - "print('batched_input shape: ', batched_input.shape)\n", - "\n", - "def predict_tftrt(saved_model_loaded):\n", - " \"\"\"Runs prediction on a single image and shows the result.\n", - " input_saved_model (string): Name of the input model stored in the current dir\n", - " \"\"\"\n", - " \n", - " signature_keys = list(saved_model_loaded.signatures.keys())\n", - " print(signature_keys)\n", - "\n", - " infer = saved_model_loaded.signatures['serving_default']\n", - " print(infer.structured_outputs)\n", - "\n", - "\n", - " for i in range(4):\n", - " img_path = './data/img%d.JPG'%i\n", - " #img_path = './data/img0.JPG' # Siberian_husky\n", - " img = image.load_img(img_path, target_size=(224, 224))\n", - " x = image.img_to_array(img)\n", - " x = np.expand_dims(x, axis=0)\n", - " x = preprocess_input(x)\n", - " x = tf.constant(x)\n", - " \n", - " labeling = infer(x)\n", - " preds = labeling['predictions'].numpy()\n", - " print('{} - Predicted: {}'.format(img_path, decode_predictions(preds, top=3)[0]))\n", - " plt.subplot(2,2,i+1)\n", - " plt.imshow(img);\n", - " plt.axis('off');\n", - " plt.title(decode_predictions(preds, top=3)[0][0][1])\n", - " \n", - "def benchmark_tftrt(saved_model_loaded):\n", - " # saved_model_loaded = tf.saved_model.load(input_saved_model, tags=[tag_constants.SERVING])\n", - " infer = saved_model_loaded.signatures['serving_default']\n", - "\n", - " N_warmup_run = 50\n", - " N_run = 1000\n", - " elapsed_time = []\n", - "\n", - " for i in range(N_warmup_run):\n", - " labeling = infer(batched_input)\n", - "\n", - " for i in range(N_run):\n", - " start_time = time.time()\n", - " labeling = infer(batched_input)\n", - " end_time = time.time()\n", - " elapsed_time 
= np.append(elapsed_time, end_time - start_time)\n", - " if i % 50 == 0:\n", - " print('Step {}: {:4.1f}ms'.format(i, (elapsed_time[-50:].mean()) * 1000))\n", - "\n", - " print('Throughput: {:.0f} images/s'.format(N_run * batch_size / elapsed_time.sum()))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "## load the model\n", - "\n", - "saved_model_loaded = tf.saved_model.load('resnet50_saved_model_TFTRT_FP16', tags=[tag_constants.SERVING])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 238 - }, - "colab_type": "code", - "id": "yTDbB6DWn0kJ", - "outputId": "33cd23f8-9c8b-4cdd-9c7a-4939aae12f56" - }, - "outputs": [], - "source": [ - "predict_tftrt(saved_model_loaded)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "G_gerN5j9l6U" - }, - "outputs": [], - "source": [ - "benchmark_tftrt(saved_model_loaded)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "oRXVurzh6o5T" - }, - "outputs": [], - "source": [ - "import os\n", - "os.kill(os.getpid(), 9)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "qKSJ-oizkVQY" - }, - "source": [ - "### TF-TRT INT8 model\n", - "\n", - "Creating TF-TRT INT8 model requires a small calibration dataset. This data set ideally should represent the test data in production well, and will be used to create a value histogram for each layer in the neural network for effective 8-bit quantization. \n", - "\n", - "Herein, for demonstration purposes, we take only the 4 images that we downloaded for calibration. In production, this set should be more representative of the production data." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "DGXjp3K_6x59" - }, - "outputs": [], - "source": [ - "\n", - "from __future__ import absolute_import, division, print_function, unicode_literals\n", - "import os\n", - "import time\n", - "\n", - "import numpy as np\n", - "import matplotlib.pyplot as plt\n", - "\n", - "import tensorflow as tf\n", - "from tensorflow import keras\n", - "from tensorflow.python.compiler.tensorrt import trt_convert as trt\n", - "from tensorflow.python.saved_model import tag_constants\n", - "from tensorflow.keras.applications.resnet50 import ResNet50\n", - "from tensorflow.keras.preprocessing import image\n", - "from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "batch_size = 8\n", - "batched_input = np.zeros((batch_size, 224, 224, 3), dtype=np.float32)\n", - "\n", - "for i in range(batch_size):\n", - " img_path = './data/img%d.JPG' % (i % 4)\n", - " img = image.load_img(img_path, target_size=(224, 224))\n", - " x = image.img_to_array(img)\n", - " x = np.expand_dims(x, axis=0)\n", - " x = preprocess_input(x)\n", - " batched_input[i, :] = x\n", - "batched_input = tf.constant(batched_input)\n", - "print('batched_input shape: ', batched_input.shape)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 201 - }, - "colab_type": "code", - "id": "iM8DshiYkVQe", - "outputId": "d3e88347-e060-4fd2-de2b-79cb38fc8442", - "scrolled": false - }, - "outputs": [], - "source": [ - "print('Converting to TF-TRT INT8...')\n", - "# max_workspace_size_bytes sets how much GPU memory will be avaible at runtime\n", - "# what happens if you make max value bigger (say 8000000000) or smaller (say 1000000000)?\n", - "max = 3000000000\n", - 
"conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(\n", - " precision_mode=trt.TrtPrecisionMode.INT8, \n", - " max_workspace_size_bytes=3000000000, \n", - " use_calibration=True)\n", - "converter = trt.TrtGraphConverterV2(\n", - " input_saved_model_dir='resnet50_saved_model', \n", - " conversion_params=conversion_params)\n", - "\n", - "def calibration_input_fn():\n", - " yield (batched_input, )\n", - "converter.convert(calibration_input_fn=calibration_input_fn)\n", - "\n", - "converter.save(output_saved_model_dir='resnet50_saved_model_TFTRT_INT8')\n", - "print('Done Converting to TF-TRT INT8')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "gu0CSNGVvwSY" - }, - "outputs": [], - "source": [ - "\n", - "\n", - "def predict_tftrt(saved_model_loaded):\n", - " \"\"\"Runs prediction on a single image and shows the result.\n", - " input_saved_model (string): Name of the input model stored in the current dir\n", - " \"\"\"\n", - " \n", - " signature_keys = list(saved_model_loaded.signatures.keys())\n", - " print(signature_keys)\n", - "\n", - " infer = saved_model_loaded.signatures['serving_default']\n", - " print(infer.structured_outputs)\n", - "\n", - "\n", - " for i in range(4):\n", - " img_path = './data/img%d.JPG'%i\n", - " #img_path = './data/img0.JPG' # Siberian_husky\n", - " img = image.load_img(img_path, target_size=(224, 224))\n", - " x = image.img_to_array(img)\n", - " x = np.expand_dims(x, axis=0)\n", - " x = preprocess_input(x)\n", - " x = tf.constant(x)\n", - " \n", - " labeling = infer(x)\n", - " preds = labeling['predictions'].numpy()\n", - " print('{} - Predicted: {}'.format(img_path, decode_predictions(preds, top=3)[0]))\n", - " plt.subplot(2,2,i+1)\n", - " plt.imshow(img);\n", - " plt.axis('off');\n", - " plt.title(decode_predictions(preds, top=3)[0][0][1])\n", - " \n", - "def benchmark_tftrt(saved_model_loaded):\n", - " # saved_model_loaded = 
tf.saved_model.load(input_saved_model, tags=[tag_constants.SERVING])\n", - " infer = saved_model_loaded.signatures['serving_default']\n", - "\n", - " N_warmup_run = 50\n", - " N_run = 1000\n", - " elapsed_time = []\n", - "\n", - " for i in range(N_warmup_run):\n", - " labeling = infer(batched_input)\n", - "\n", - " for i in range(N_run):\n", - " start_time = time.time()\n", - " labeling = infer(batched_input)\n", - " end_time = time.time()\n", - " elapsed_time = np.append(elapsed_time, end_time - start_time)\n", - " if i % 50 == 0:\n", - " print('Step {}: {:4.1f}ms'.format(i, (elapsed_time[-50:].mean()) * 1000))\n", - "\n", - " print('Throughput: {:.0f} images/s'.format(N_run * batch_size / elapsed_time.sum()))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "## load the model\n", - "\n", - "saved_model_loaded = tf.saved_model.load('resnet50_saved_model_TFTRT_INT8', tags=[tag_constants.SERVING])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 238 - }, - "colab_type": "code", - "id": "DSKfqPDGEv7U", - "outputId": "028545e1-26e5-4fb6-930f-7b10af2c82fe" - }, - "outputs": [], - "source": [ - "predict_tftrt(saved_model_loaded)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "2mM9D3BTEzQS" - }, - "outputs": [], - "source": [ - "benchmark_tftrt(saved_model_loaded)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "I13snJ9VkVQh" - }, - "source": [ - "## Conclusion\n", - "In this notebook, we have demonstrated the process of creating TF-TRT FP32, FP16 and INT8 inference models from an original Keras FP32 model, as well as verify their speed and accuracy. 
\n", - "\n", - "### What's next\n", - "Try changing the batch size for the input and the max_workspace_size_bytes to see if you can get better (or worse) performance. What happens if you change the NX's power modes?\n", - "\n", - "Try TF-TRT on your own model and data, and experience the simplicity and speed up it offers." - ] - } - ], - "metadata": { - "accelerator": "GPU", - "colab": { - "include_colab_link": true, - "machine_shape": "hm", - "name": "Colab-TF20-TF-TRT-inference-from-Keras-saved-model.ipynb", - "provenance": [] - }, - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.9" - } - }, - "nbformat": 4, - "nbformat_minor": 1 -} diff --git a/week06/hw/README.md b/week06/hw/README.md index 6de4598..2f4a017 100644 --- a/week06/hw/README.md +++ b/week06/hw/README.md @@ -1,75 +1,19 @@ # Homework 6 +## Model optimization and quantization -This homework requires a Jetson device. If you do not have a device, please just sumbit the HW answering just questions 2 and 3 from Part 1. +In lab, you saw to how use leverage TensorRT with TensorFlow. For this homework, you'll look at Hugging Face 🤗 Optimum (https://huggingface.co/docs/optimum/main/en/index). Go over training materials provided in the documentation. - -This homework covers some use of GStreamer and model optimization. It builds on the week 6 lab and completing the lab first is hightly recommended. - -This is an ungraded assignment - -## Part 1: GStreamer - -1. In the lab, you used the Ndida sink nv3dsink; Nvidia provides a another sink, nveglglessink. Convert the following sink to use nveglglessink. -``` -gst-launch-1.0 v4l2src device=/dev/video0 ! xvimagesink -``` - -2. 
What is the difference between a property and a capability? How are they each expressed in a pipeline? - -3. Explain the following pipeline, that is explain each piece of the pipeline, desribing if it is an element (if so, what type), property, or capability. What does this pipeline do? - -``` -gst-launch-1.0 v4l2src device=/dev/video0 ! video/x-raw, framerate=30/1 ! videoconvert ! agingtv scratch-lines=10 ! videoconvert ! xvimagesink sync=false -``` - -4. GStreamer pipelines may also be used from Python and OpenCV. For example: -``` -import numpy as np -import cv2 - -# use gstreamer for video directly; set the fps -camSet='v4l2src device=/dev/video0 ! video/x-raw,framerate=30/1 ! videoconvert ! video/x-raw, format=BGR ! appsink' -cap= cv2.VideoCapture(camSet) - -#cap = cv2.VideoCapture(0) - -while(True): - # Capture frame-by-frame - ret, frame = cap.read() - - # Display the resulting frame - cv2.imshow('frame',frame) - if cv2.waitKey(1) & 0xFF == ord('q'): - break - -# When everything done, release the capture -cap.release() -cv2.destroyAllWindows() -``` -In the lab, you saw how to stream using Gstreamer. Using the lab and the above example, write a Python application that listens for images streamed from a Gstreamer pipeline. You'll want to make sure your image displays in color. - -For part 1, you'll need to submit: -- Answer to question 1 -- Answer to question 2 -- Answer to question 3 -- Source code and Gstreamer "server" pipeline used. - - -## Part 2: Model optimization and quantization - -In lab, you saw to how use leverage TensorRT with TensorFlow. For this homework, you'll look at another way to levarage TensorRT with Pytorch via the Jetson Inference library (https://github.com/dusty-nv/jetson-inference). - -You'll want to train a custom image classification model, using either the fruit example or your own set of classes. - -Like in the lab, you'll want to first baseline the your model, looking a the number of images per second it can process. 
You may train the model using your Jetson device and the Jetson Inference scripts or train on a GPU eanabled server/virtual machine. Once you have your baseline, follow the steps/examples outlined in the Jetson Inference to run your model with TensorRT (the defaults used are fine) and determine the number of images per second that are processed. +You first need to train a custom image classification model, using either the fruit example or your own set of classes. Like in the lab, you'll want to first baseline your model, looking at the number of images per second it can process. You may train the model on a GPU-enabled server/virtual machine. Once you have your baseline, follow the steps/examples from Hugging Face 🤗 Optimum to optimize and quantize the model. You may use either the container approach or build the library from source. -For part 2, you'll need to submit: +## Turn-in +You'll need to submit: - The base model you used - A description of your data set - How long you trained your model, how many epochs you specified, and the batch size. - Native Pytorch baseline -- TensorRT performance numbers +- Optimized performance numbers +- Code used for optimization diff --git a/week07/hw/README.md b/week07/hw/README.md index 40bdca8..8454544 100644 --- a/week07/hw/README.md +++ b/week07/hw/README.md @@ -14,7 +14,7 @@ The goal of the homework is to develop a model in kaggle (referencing the [Bench You can see an overview of how to use kaggle in the overview page. -There is a public leaderboard which will be visible through the competition. On the deadline (`Midnight UTC, March 2nd`), the challenge will close and the private leaderboard rankings will be revealed!! +# Turn in To submit your homework, you need to submit two things to ISVC. 1. A screen shot of your score on the leaderboard (public leaderboard is ok).
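The baselining the new week 6 homework asks for (native PyTorch vs. Optimum-optimized, in images per second) is the same measurement the removed lab notebook's `benchmark_tftrt` helper performed. A minimal, framework-agnostic sketch of that measurement — `infer_fn` and the batch object are placeholders for whatever model and input you actually use, not course-provided code:

```python
import time

def benchmark(infer_fn, batch, batch_size=8, n_warmup=50, n_run=1000):
    """Return throughput (images/s) of an inference callable.

    Mirrors the removed notebook's benchmark_tftrt: run warm-up iterations
    first so one-time graph/engine build cost is excluded, then time only
    the steady-state inference loop.
    """
    for _ in range(n_warmup):
        infer_fn(batch)

    total = 0.0
    for _ in range(n_run):
        start = time.perf_counter()
        infer_fn(batch)
        total += time.perf_counter() - start

    # n_run batches of batch_size images each, divided by total wall time
    return n_run * batch_size / total
```

Running this once on the baseline model and once on the quantized model gives the two performance numbers the turn-in list asks for, measured the same way.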
diff --git a/week08/Readme.md b/week08/Readme.md index ed9401e..6377a1d 100644 --- a/week08/Readme.md +++ b/week08/Readme.md @@ -4,6 +4,7 @@ Types of datasets. Key public datasets. Cloud platforms for hosting and processi ### Suggested Reading: +* [🤗 Datasets](https://huggingface.co/course/chapter5/1?fw=pt) * [Albumentations](https://github.com/albumentations-team/albumentations) * [Nvidia DALI](https://github.com/NVIDIA/DALI) * [ImageNet](https://image-net.org/)
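Both the removed TF-TRT notebook and the new Optimum homework rest on the same INT8 idea: use a small calibration sample to choose a scale that maps float values onto the 256 representable int8 levels, which is why the calibration set should resemble production data. A toy pure-Python illustration of symmetric per-tensor quantization — a conceptual sketch only, not the actual calibration algorithm of TensorRT or Optimum (those build per-layer histograms rather than taking a simple max):

```python
def calibration_scale(calib_values):
    """Symmetric per-tensor scale: map the largest calibration magnitude to 127."""
    return max(abs(v) for v in calib_values) / 127.0

def quantize(x, scale):
    """Round to the nearest int8 step and clamp to the representable range."""
    return max(-128, min(127, round(x / scale)))

def dequantize(q, scale):
    return q * scale

# Values outside the calibration range saturate at the clamp, and every
# value is recovered only to within one quantization step.
scale = calibration_scale([-1.5, 0.2, 3.0, 0.7])
assert quantize(3.0, scale) == 127
assert quantize(-10.0, scale) == -128
assert abs(dequantize(quantize(0.7, scale), scale) - 0.7) <= scale
```

The roundtrip error bound (one scale step) is the quantization noise the calibration dataset is meant to keep small on realistic inputs.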