Tutorials of Carya at the University of Houston

Carya

Step 1: Request an account and fill out the allocation time request from the UH HPC center

See the details at the following link: https://uh.edu/rcdc/getting-started/

Step 2: Basic operations

There are two locations where you can work: your local home directory and your PI's project directory. Typically, put the code and data into your PI's project directory.

Your local home directory (disk space: only 10 GB):

cd ~

Your PI's project directory (allocated by request, 4 TB):

cd /project/<PI name>/<your user name>/

If there is no folder with your user name, create it yourself.

mkdir /project/<PI name>/<your user name>

Here are some common commands; skip them if you are already familiar with them.

To remove a file or a folder recursively:

rm <file path>
rm -r <folder path>

Other commands to move, compress, and extract files:

mv <file path> <destination path>
tar -czvf <file name.tar.gz> <folder path>
tar -xzvf <file name.tar.gz>
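
For example, to archive a results folder into your project space and list the archive contents without extracting them (the paths are placeholders to adapt):

tar -czvf /project/<PI name>/<your user name>/results.tar.gz ./results
tar -tzvf /project/<PI name>/<your user name>/results.tar.gz   # list contents without extracting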

To kill a process, check the process list first and then kill it by its PID:

lsof +D <path>
kill -15 <pid>   # graceful stop; use kill -9 <pid> if it does not exit

Step 3: Follow the official instructions from the UH HPC center

https://uh.edu/research/rcdc/support-and-services/user-guide/getting-started-clusters.php
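
That guide covers submitting batch jobs with Slurm's sbatch. As a minimal sketch only (the job name, resources, and script are placeholder values to adapt to your own work), a job script could look like:

#!/bin/bash
#SBATCH -J myjob            # job name
#SBATCH -o myjob.o%j        # output file; %j expands to the job id
#SBATCH -N 1                # number of nodes
#SBATCH -n 28               # number of tasks (cores)
#SBATCH -t 01:00:00         # walltime limit

# load modules and activate your environment here (see the conda commands below)
python train.py

Submit it with sbatch <script name>.sh; the output file naming matches the myjob.o<job id> file mentioned later in this tutorial.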

Quick command to activate the conda env:

module add Miniconda3/py310
source  $(dirname `which python`)/../etc/profile.d/conda.sh
conda activate /project/chen/envs/ocp-models
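
If you need your own environment instead of the shared one above, a sketch (the prefix path below is only an example location inside your project space):

conda create --prefix /project/<PI name>/<your user name>/envs/myenv python=3.10
conda activate /project/<PI name>/<your user name>/envs/myenv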

Here are some useful operations that are not in that tutorial.

To show the size of a directory

du -sh <path> 

To count the files in a directory

find <Path> -type f | wc -l 

To cancel a job listed in squeue:

scancel <job id> 

To check the output of a job

cat myjob.o<123456> 
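
To check the job status and follow the output while it is still running (the job id and file name are placeholders):

squeue -u $USER              # list your pending and running jobs
tail -f myjob.o<123456>      # follow the output file as it grows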

RSA key setup

To authorize your local machine, you can add the RSA key to the server so that you don't need to type the password each time. On your local machine:

ssh-keygen -t rsa -b 4096

You will be prompted to enter the file path where the key will be saved. By default, it is saved in

~/.ssh/id_rsa

Press Enter twice to accept the default path and skip the passphrase. Then copy the generated public key to the remote server

ssh-copy-id <username>@<remote_host>

Now the setup is finished; log in to the server to check whether you still need the password. The copied key can be found on the server in

~/.ssh/authorized_keys

You can do more with this, such as authorizing several local machines to connect to the server or chaining connections between servers. If it does not work, it is probably a permission problem; use the commands below.

chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
chmod 755 ~  # Home directory
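
If you connect through Carya to another machine, a sketch of a local ~/.ssh/config entry using ProxyJump (the host names and user name are placeholders; carya.rcdc.uh.edu is the login host used later in this tutorial):

Host carya
    HostName carya.rcdc.uh.edu
    User <username>
    IdentityFile ~/.ssh/id_rsa

Host inner
    HostName <inner host name>
    User <username>
    ProxyJump carya

With this in place, ssh carya and ssh inner work directly from your local terminal.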

Some git operations

When you start a new project, using GitHub as the project manager is highly recommended. To copy from an existing repo:

git clone -b main --single-branch https://<token>@github.com/<repository>

cd into the folder and follow the steps below

rm -rf .git
mv ../<old repository name> ../<new name>
git init
git add .
git commit -m "first commit"
git branch -M main

Now, create a new repository on the GitHub website without adding a README, .gitignore, or license.

git remote add origin https://github.com/moxx799/<repository>.git
git push -u origin main

Now the new repository exists both locally and remotely. After editing the files,

git commit -m "<your commit description>"

It will tell you which files need to be added

git add <files>

Commit again and push

git commit -m "<your commit description>"
git push

To stash the local changes and then pull:

git stash
git pull

Then either drop the stash or re-apply it:

git stash drop
git stash pop

Jupyter notebook setup

It is not convenient to develop code on a Linux system without a GUI, so Jupyter Notebook is recommended.

Here are the steps to set it up:

  • Set up conda env and activate it, following the UH tutorial
  • Install JupyterLab
conda install jupyterlab
  • Set up the server
jupyter lab --no-browser --ip=127.0.0.1 --port=88<xx> 

You will see an address like the following, but you cannot open it directly

http://127.0.0.1:8890/?token=gcf45711b1540d5004eff093e7fe8a511d339b28d6171012

  • Open a new terminal on your local machine and type
ssh -l <username> carya.rcdc.uh.edu -L 88<xx>:127.0.0.1:88<xx>
  • The port above can be any 4-digit number like 8888; change the number yourself if the port cannot be listened on.
  • Afterward, you can open the address in your local browser and use Jupyter from your local machine.

To run the task on a compute node,

  • First, request a compute node: salloc -t 1:00:00 -n 28 -N 1
  • Then run the same no-browser command on that node to listen on the port
jupyter lab --no-browser --ip=127.0.0.1 --port=88<xx> 
  • Finally, from your local machine: ssh -J <username>@carya.rcdc.uh.edu -L 88<xx>:127.0.0.1:88<xx> <username>@compute-0-0 (the node name varies; see the sketch after this list)
  • Now you can click the address to use it.
  • To check whether the port is already occupied on your local machine, type
lsof -i :88<xx>
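
The node name (compute-0-0 above) depends on what salloc assigns you; a quick way to find it (an illustrative check, not specific to Carya):

squeue -u $USER -o "%.10i %.9P %.20j %.8T %N"   # the last column lists the node(s) of your job
hostname                                         # or run this inside the salloc session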

Tensorboard remote setup

Some ML tasks save their logs in TensorBoard format; you can open them in your local browser as follows:

  1. on local: ssh -L 16006:127.0.0.1:6006 user@server
  2. on server: tensorboard --logdir=<profile-logs> --port=6006 --bind_all
  3. on local browser: open 127.0.0.1:16006 or localhost:16006

Set up remote VS Code on a compute node rather than the login node

VS Code Remote SSH is another option; it provides several useful extensions, and you can run it interactively even on a compute node. Refs: microsoft/vscode-remote-release#1722 (comment), https://code.visualstudio.com/docs/remote/tunnels

  1. Install the Code CLI: curl -Lk 'https://code.visualstudio.com/sha/download?build=stable&os=cli-alpine-x64' --output vscode_cli.tar.gz, then tar -xf vscode_cli.tar.gz
  2. Request a node: salloc -t 1:00:00 -n 24 --gpus=1 -N 1
  3. Create a tunnel with ./code tunnel, select GitHub, and use the code it prints to connect your account at https://github.com/login/device
  4. Open VS Code on your local machine, press Ctrl+Shift+P, and type "tunnel" to find the "Remote-Tunnels: Connect to Tunnel..." command. The tunnel you created shows up in the list; click it.
  5. You need to install the extensions when you first launch it.
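
Steps 1 to 3 gathered into one copy-pasteable sequence on the server side (the resource numbers are just the example values above):

# on the login node: download and unpack the VS Code CLI
curl -Lk 'https://code.visualstudio.com/sha/download?build=stable&os=cli-alpine-x64' --output vscode_cli.tar.gz
tar -xf vscode_cli.tar.gz
# request a compute node, then start the tunnel on it
salloc -t 1:00:00 -n 24 --gpus=1 -N 1
./code tunnel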

Some other operations, as a memo

To combine CSV files that share the same header, go to the directory containing the CSV files

cd /path/to/csv/files

Concatenate the files, keeping the header from only the first file:

head -1 one_of_your_files.csv > merged.csv
tail -n +2 -q *.csv >> merged.csv
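
One caveat with the snippet above: merged.csv itself matches *.csv while it is being appended to, so it is safer to write the merged output outside the glob first (the file names are just examples):

head -1 one_of_your_files.csv > ../merged.csv
tail -n +2 -q *.csv >> ../merged.csv
mv ../merged.csv .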

You do not need to read the following:




If you do not want to change the contents of the Dockerfile, you can use the following command to build the image:

docker build -t xubuntu:1.8 https://github.com/moxx799/Docker-file.git
  • Start from the pytorch 1.14.0a image:

The available options are:

  • BASE_IMAGE: The base image for building this desktop image. Default: nvcr.io/nvidia/pytorch:22.12-py3
  • BASE_LAUNCH: The entrypoint script from the base image. If there is no entry script, please use "". Default: /opt/nvidia/nvidia_entrypoint.sh
  • WITH_CHINESE: If set, the image will be built with Chinese support for vscode, sublime and codeblocks. Default: true
  • WITH_EXTRA_APPS: The extra applications to install. Each character represents one app or several apps. For example, cgo represents fully installing Cloudreve, GIMP, LibreOffice and Thunderbird. More details can be found in the following table. Default: cgo
  • ADDR_PROXY: Set the proxy address pointing to localhost. If specified, this value should be a full address. (Experimental feature.) Default: unset

Here we show the list of extra apps:

  • c: Cloudreve
  • p: PyCharm
  • g: GIMP
  • k: GitKraken
  • m: Sublime Text 4
  • x: TeXLive + TeXstudio
  • n: Nautilus + Nemo
  • o: LibreOffice + Thunderbird
  • e: GNU Emacs
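
Assuming these options are exposed as Docker build arguments (check the Dockerfile in the repo to confirm; the values below are only examples), a build command might look like:

docker build -t xubuntu:1.8 \
    --build-arg BASE_IMAGE=nvcr.io/nvidia/pytorch:22.12-py3 \
    --build-arg WITH_EXTRA_APPS=cgo \
    https://github.com/moxx799/Docker-file.git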

To find the launch script of your base image, use

docker inspect <your-base-image>:<tag>
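
To print only the entrypoint instead of the full JSON, a Go template filter can be used:

docker inspect --format '{{json .Config.Entrypoint}}' <your-base-image>:<tag>
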
  • Switch the VNC server to XTigerVNC (experimental): Adding the option --xvnc makes the desktop hosted by the Xvnc program. Everything will run in the same process, and there will be no sub-process manager like tigervncserver to manage the desktop-related programs. A benefit is that users do not need to run tigervncserver -kill :1 before saving the image. However, these desktop-related programs are currently not guaranteed to be closed when hitting Ctrl+C, so we suggest using ps -aux to check the running processes before saving the image.

    After using Ctrl+C to kill the Xvnc program, users can use the following command to relaunch the Xvnc and noVNC services:

    xvnc-launch [--root]
  • With Cloudreve 🔗: We recommend launching Cloudreve by opening a new terminal on the desktop and using the following commands:

    crpasswd  # only used for checking the INITIAL admin password.
    cloudreve  # launch the Cloudreve service; requires exposing port 5212.

    ⚠️ Using Cloudreve requires adding the extra app c to the option WITH_EXTRA_APPS when building the image.

    ⚠️ We STRONGLY recommend changing the admin password and creating a non-admin user for Cloudreve. You can also configure your data exchange folder.

Features

This is a minimal desktop image based on Ubuntu 16.04, 18.04, or 20.04. It has:

  • Cloudreve Service (Chinese only) 🔗: a private cloud storage service that allows users to expose their personal folder as an "online drive" available on the LAN. Users who are interested can dig into the configuration and enable more features (like WebDAV and offline downloading). Currently this feature is designed for using a browser-based app to replace the WinSCP client.

Update records

ver 1.0 @ 08/18/2023

  1. Initial build.
