
Sysadmin Task List

Saurabh Asthana edited this page Apr 24, 2023 · 19 revisions
  • Storage server deployment.
    • Wipe old BeeGFS server.
    • Set up rsync server.
      • Make sure /cache is also backed up via rsync
    • Set up new tape backup.
    • Set up new cloud backup (Azure via IT).
    • Erase old cloud backup.
  • Compute node deployment to C4. Depends on supply chain / buyer.
    • Update ship-to address when PO gets executed.
    • Secure delivery of node
    • Decommission n20 -- do we want to do something with it?
  • New VMware server. Replaces the current servers (too old to be updated). Could get faster CPUs, more RAM, and hulao redundancy.
    • Requires new network switches to get Swarm private network. Waiting on Super Micro.
  • Add gnomon.ucsf.edu to our SSL cert
  • Enable scrontab or figure out where to schedule cron jobs on the cluster
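
Once gnomon.ucsf.edu has been added to the certificate, one way to confirm it is to read the subjectAltName list from a host that already serves the cert. The following is a minimal sketch, not a description of how the cert is actually managed here: the function names are made up for illustration, and `connect_host` stands for whichever existing host serves the certificate (an assumption; substitute the real one).

```python
import socket
import ssl


def san_dns_names(cert):
    """Extract DNS entries from a cert dict as returned by SSLSocket.getpeercert()."""
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]


def name_is_covered(hostname, dns_names):
    """Return True if hostname matches a SAN entry exactly or via a one-label wildcard."""
    hostname = hostname.lower()
    for name in dns_names:
        name = name.lower()
        if name == hostname:
            return True
        if name.startswith("*."):
            suffix = name[1:]  # e.g. ".example.net"
            if hostname.endswith(suffix):
                prefix = hostname[: -len(suffix)]
                # A wildcard stands for exactly one non-empty leftmost label.
                if prefix and "." not in prefix:
                    return True
    return False


def fetch_san_dns_names(connect_host, port=443, timeout=5.0):
    """Connect to a host that already serves the cert and return its SAN DNS names.

    connect_host must itself be covered by the cert, otherwise the default
    context rejects the handshake before the cert can be inspected.
    """
    context = ssl.create_default_context()
    with socket.create_connection((connect_host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=connect_host) as tls:
            cert = tls.getpeercert()
    return san_dns_names(cert)
```

Usage would look like `name_is_covered("gnomon.ucsf.edu", fetch_san_dns_names(connect_host))`, run from any machine that can reach the serving host on port 443.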

Things for us to know:

  • VMware licenses must be renewed every year -- about $200, but hard to tie to a PO. Used a credit card and got reimbursed.
  • We need to upgrade the CentOS systems, probably to Rocky Linux 8 (to stay in lockstep with Wynton).
  • Could submit Slurm jobs from worker nodes (would need configuration)
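
If submission from worker nodes does get configured, a quick smoke test could be run on a worker to confirm it. This is only a sketch under that assumption: the helper names, the job name, and the time limit are illustrative, and it relies only on the standard `sbatch` options `--parsable`, `--job-name`, `--time`, and `--wrap`.

```python
import subprocess


def build_sbatch_command(command, job_name="submit-test", time_limit="00:05:00"):
    """Build the sbatch argument list for a one-line wrapped command."""
    return [
        "sbatch",
        "--parsable",  # print only the job ID (optionally followed by ";cluster")
        f"--job-name={job_name}",
        f"--time={time_limit}",
        f"--wrap={command}",
    ]


def submit(command, **options):
    """Submit the command with sbatch and return the job ID as a string.

    Raises subprocess.CalledProcessError if sbatch exits non-zero, which is
    the expected outcome on a node not set up for submission.
    """
    result = subprocess.run(
        build_sbatch_command(command, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip().split(";")[0]
```

Calling `submit("hostname")` on a worker node would then either return a job ID or raise, which makes it easy to tell whether the configuration is in place.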

Obscure questions:

  1. How is the tape database itself backed up/restored?
  2. Is there an API to the tape database to query file status?
