Sockeye Introduction

University of British Columbia

Tony Liang

September 27, 2023

Land acknowledgement

I would like to acknowledge that I work on the traditional, ancestral, and unceded territory of the Coast Salish Peoples, including the territories of the xwməθkwəy̓əm (Musqueam), Skwxwú7mesh (Squamish), Stó:lō and Səl̓ílwətaʔ/Selilwitulh (Tsleil-Waututh) Nations.

Traditional: Traditionally used and/or occupied by Musqueam people

Ancestral: Recognizes land that is handed down from generation to generation

Unceded: Refers to land that was not turned over to the Crown (government) by a treaty or other agreement

Setup VSCode

Check out the instructions for your platform below:

Open this link to download VS Code. Once the download finishes:

  1. Open the downloaded VSCode-darwin-universal.zip file.
  2. Move the extracted VS Code app from Downloads to Applications (located in the Finder sidebar under Favorites).
  3. Launch VS Code and open the Command Palette with Cmd + Shift + P.
  4. Type shell command.
  5. Select the option: Install code command in PATH.
  6. Restart your terminal and try entering: code --version to confirm successful installation.

Open this link to download the installer and run it:

On the Select Additional Tasks screen:

  • Add ‘Open with Code’ action to Windows file context menu
  • Add ‘Open with Code’ action to Windows directory context menu
  • Register Code as an editor for supported file types
  • Add to PATH (should be selected by default)

Connecting to Sockeye

Note

You need a UBC IP address to access the server, either by connecting to UBC WiFi or by using the UBC VPN

Log in to Sockeye with the ssh command:

ssh cwl@sockeye.arc.ubc.ca
  • cwl is your campus-wide login.

  • the password is your CWL password.

  • sockeye.arc.ubc.ca is the server address.

  • you also need Two-Factor Authentication (2FA).

Alternative way to connect to Sockeye

  1. Generate a new ssh key pair on your local machine.
  2. Add your server login info to ~/.ssh/config on your local machine.
  3. Add the public key to the server.
  4. Login with ssh <new_alias>

Generate a new ssh key pair

ssh-keygen -t rsa
  • Accept the defaults by pressing Enter
  • A passphrase is optional
  • Two keys generated:
    • one public (id_rsa.pub)
    • one private (id_rsa)

Caution

Keep the private key safe; it is for your use only
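For scripting, the same key pair can be generated without any prompts. This is a sketch: the output path demo_key and the empty passphrase are illustrative only, and for a real key you would normally keep a passphrase.

```shell
# Generate an RSA key pair non-interactively:
#   -f sets the output path, -N "" sets an empty passphrase, -q silences output
ssh-keygen -t rsa -b 4096 -f ./demo_key -N "" -q

# Both the private and public keys should now exist
ls demo_key demo_key.pub
```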

Alternative way to connect to Sockeye

Use any editor to create ~/.ssh/config and enter the following:

# Note: the indentation matters
Host arc # or another alias you like
  HostName sockeye.arc.ubc.ca
  User cwl # replace with your CWL


  • ssh Host expands to ssh User@HostName
  • Multiple Host blocks support multiple servers
  • More options are available; see here
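If you work with more than one server, the Host blocks simply stack in the same file. The second host below is purely illustrative (the name, address, and port are made up):

```
# ~/.ssh/config
Host arc
  HostName sockeye.arc.ubc.ca
  User cwl

# A hypothetical second server
Host lab
  HostName lab.example.org
  User cwl
  Port 2222
```

With this in place, ssh arc and ssh lab each pick up their own settings.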

Add your public key to the server

# macOS/Linux: arc is the alias from our config
ssh-copy-id -i ~/.ssh/id_rsa.pub arc # or cwl@sockeye.arc.ubc.ca
# Windows (PowerShell): remember to replace cwl!
type $env:USERPROFILE\.ssh\id_rsa.pub | ssh cwl@sockeye.arc.ubc.ca "cat >> .ssh/authorized_keys"

Sockeye

# If you followed earlier configuration correctly
ssh arc # otherwise ssh cwl@sockeye.arc.ubc.ca
  • Upon login, you land on the login node, where you will spend most of your time
  • Do not use it to run large computational work
  • Whenever you’re ready, submit batch jobs to the scheduler
  • You can log out and leave a job running
  • Sockeye runs PBS, changing to Slurm soon

Tip

We also have a recording from a past session; if you’re interested, please contact Amrit

Directories on Sockeye

Tip

Remember that read/write access matters! You don’t want to spend hours fitting models only to lose the results because you forgot to write them to the scratch directory.

VScode in Sockeye

Terminal text editors like nano or vi/vim can be hard to use (steep learning curve)

So, use VSCode instead!


  1. Open a local terminal and run code

  2. Press Ctrl + Shift + X

  3. Type Remote - SSH in search bar

  4. Open the first result and click Install

Accessing software

  • Once you connect with ssh, no applications are loaded
  • You must tell the system what you need
  • Options are limited; if you need more, try a container or conda

Useful commands

module load bin1/x.y.z bin2/x.y.z # load two modules and their versions
module purge # unload all modules
module list # what you have loaded
module save name_col # save loaded modules as collection "name_col"
module savelist # list saved collections (name_col in this case)
module restore name_col # load a saved collection
module avail # show all available modules
module spider mod_name # check versions of mod_name available on Sockeye

Example Job Script

Note

There is no internet access during any of the jobs below:

#!/bin/bash
#PBS -l walltime=00:05:00,select=1:ncpus=2:mem=2gb
#PBS -N sample_job
#PBS -A st-singha53-1
#PBS -m abe
#PBS -M tliang19@student.ubc.ca
#PBS -o my_output.txt
#PBS -e my_error.txt

cd ${PBS_O_WORKDIR}
python hello_world.py
  • This asks for 5 minutes of compute time with 2 CPUs and 2GB of RAM
  • You would request much larger resources in complex scenarios
  • The script body can also load modules, activate environments, and run shell code
  • Most jobs fall into this category
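The hello_world.py referenced in the script above is just a placeholder; a minimal version to exercise the submission pipeline end to end could be created like this (the file name and contents are illustrative):

```shell
# Create a trivial hello_world.py to test job submission end to end
cat > hello_world.py <<'EOF'
print("Hello from Sockeye!")
EOF

# Run it locally first to make sure it works before submitting
python3 hello_world.py
```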
#!/bin/bash
#PBS -l walltime=12:00:00,select=1:ncpus=4:ngpus=2:mem=64gb
#PBS -N sample_job_gpu
#PBS -A st-singha53-1-gpu
#PBS -M tliang19@student.ubc.ca
#PBS -o my_output.txt
#PBS -e my_error.txt

# load modules
module load gcc apptainer cuda

cd ${PBS_O_WORKDIR}
python hello_world.py
  • This asks for 12 hours of compute time on 1 node with 4 CPUs, 2 GPUs, and 64GB of RAM
  • The number of GPUs is specified with ngpus
  • GPU jobs only allow 32GB or 64GB of memory
  • Note the special allocation code with the -gpu suffix (st-alloc-gpu)
  • You would typically run this kind of job when your work uses neural networks
#!/bin/bash
#PBS -l walltime=3:00:00,select=1:ncpus=4:mem=16gb
#PBS -I
#PBS -q interactive_cpu
#PBS -N sample_job
#PBS -A st-singha53-1

# This section should be empty, since you will get back a bash shell
# with the resources requested in the script.
# Interactive CPU jobs are recommended
# when you have simple experiments to conduct without a GPU
  • This asks for an interactive node (shell) for 3 hours, with 4 CPUs and 16GB of RAM
  • The #PBS -I and #PBS -q interactive_cpu lines are required to make a job interactive on CPU
  • Otherwise the job will be treated as a standard batch job
  • Use this to test small computations or functionality of libraries you aim to use
#!/bin/bash
#PBS -l walltime=3:00:00,select=1:ncpus=4:ngpus=1:mem=16gb
#PBS -I
#PBS -q interactive_gpu
#PBS -N sample_job
#PBS -A st-singha53-1

# This section should be empty, since you will get back a bash shell
# with the resources requested in the script.
# Interactive GPU jobs are recommended when you have simple experiments
# to conduct that use a GPU.
  • This asks for an interactive node (shell) for 3 hours, with 4 CPUs, 1 GPU, and 16GB of RAM
  • Same as the interactive CPU job but with GPU support
  • The special st-alloc-gpu code is not needed
  • Rarely used, but Sockeye makes it available

Submitting and other useful commands

  • Suppose our PBS script is saved as job_script.sh
qsub job_script.sh # to submit job

qstat -u $USER # $USER is a pre-defined ENV variable

qdel some_job_id # some_job_id is returned after submission

qstat -x some_job_id # see resource consumption of a job
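Putting these together, a small wrapper can submit a script and immediately check on it. This is a sketch: the guard lets it degrade gracefully on machines without PBS, and job_script.sh is assumed to exist.

```shell
# Submit job_script.sh and poll its status (job id format is scheduler-specific)
if command -v qsub >/dev/null 2>&1; then
  job_id=$(qsub job_script.sh)
  echo "Submitted: $job_id"
  qstat "$job_id"
else
  echo "PBS commands not available on this machine"
fi
```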

Important

  1. On Sockeye, jobs must be run from ~/scratch/ only.
  2. There is no internet access on the compute nodes (launched jobs).

Summary workflow

  1. Login via ssh
  2. Modify your job script
    • required modules
    • required environment
    • required resources
  3. Submit job
  4. Inspect results
  5. Pray and repeat until success!

Tip

We are building our lab handbook with known FAQs for debugging on Sockeye; check it out here

Thanks!