The Statlab Computational Server (statlab.princeton.edu) comprises multiple machines (nodes): a control (head) node, two small computational nodes (node002–node003), two hybrid CPU/GPU systems for larger computations (node004–node005), and a total of 18TB of disk storage.
Details
The two small computational nodes (node002–node003) each have 2 x Intel Xeon E5-2687W v3 CPUs, 48GB RAM, and 20 Cores.
The two larger CPU/GPU systems differ from each other. The first (node004) has 2 x Intel Xeon E5-2637 v4 CPUs, 64GB RAM, and 8 Cores, with 2 x NVIDIA Titan V GPUs. The second (node005) has an Intel Xeon W-2155 CPU with 256GB RAM and 10 Cores, as well as 2 x NVIDIA Quadro RTX 6000 PCI-E 24GB GPUs.
Transfer your code, dataset, and any other supporting files to the remote system.
- On Windows, download and install WinSCP.[^1]
- Open WinSCP and type the hostname of the remote system into the hostname field and provide your credentials in the username and password fields.
- Optionally, click the Save button to save the hostname and credentials as a Site so that you do not have to reenter them each time you launch WinSCP.
- Click the Login button to connect to the remote system. If this is the first time you are connecting to the system from a given machine, you will see a warning about accepting an unknown server's host key. The host key uniquely identifies the server and is cached to negotiate the security of subsequent connections.[^2]
- Transfer files to and from the remote system. The file list on the left represents your local files while the remote system's files are on the right.
- On a Mac, download and install Fetch.[^3]
- Open Fetch and type the hostname of the remote system into the hostname field and provide your credentials in the username and password fields.
- Optionally, click the Heart button and choose Make Shortcut to save the hostname and credentials as a shortcut so that you do not have to reenter them each time you launch Fetch.
- Click the Connect button to connect to the remote system.
- Transfer files to and from the remote system by dragging files from the Finder to and from the Fetch window.
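If you prefer a terminal to a graphical client, the same transfer can usually be done with scp, which is part of the OpenSSH tools. The sketch below assumes that statlab accepts scp connections for your account; the helper name `upload_to_statlab`, the username, and the file names are placeholders for illustration, not values taken from this guide.

```shell
# Copy local files into your home directory on statlab.
# Usage: upload_to_statlab USERNAME FILE [FILE...]
upload_to_statlab() {
    local user="$1"
    shift
    # An empty path after the colon means the remote home directory.
    scp "$@" "${user}@statlab.princeton.edu:"
}

# Example (placeholder username and file names):
# upload_to_statlab your_username analysis.R data.csv job.slurm
```

Swapping the arguments of scp (remote path first, local directory last) copies results back to your own machine.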
Prepare a Job
Provided you have transferred your code, dataset, and any other supporting files, you can submit your code for execution as a job using a SLURM submission script. SLURM is a high-performance computing job management system that allows for the execution and control of your run across multiple machines (nodes). Create a text file named job.slurm containing the following example script, transfer it to the server along with your code, and change the last line to the command that executes your code. Note that you should change the email address in the example script so that you receive an email message when your job begins and ends.
Example Submission Script
    #!/bin/bash
    # This line is ignored as a comment because it begins with a # followed by a space.
    # The lines that begin #SBATCH are interpreted as directives for the queueing system.
    # Number of nodes requested.
    #SBATCH -N 1
    # Number of processors requested.
    #SBATCH --ntasks-per-node=1
    # Maximum execution time.
    #SBATCH -t 8:00:00
    # Request an email when the job begins.
    #SBATCH --mail-type=begin
    # Request an email when the job ends.
    #SBATCH --mail-type=end
    # Send email to the address indicated.
    #SBATCH --mail-user=firstname.lastname@example.org
    # The following line is the command to execute.
    echo "This is a test." > ~/test-$(date +%s)
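Before transferring the script, it can help to confirm that it parses as valid shell and to review the directives it contains. The following is only a local sanity check, assuming bash is available where you edit the file; it does not validate the directives against SLURM itself, and the helper name `check_job_script` is made up for illustration.

```shell
# Check a submission script without running it.
# Usage: check_job_script job.slurm
check_job_script() {
    local script="$1"
    # bash -n parses the file and reports syntax errors without executing it.
    bash -n "$script" || return 1
    # List the #SBATCH directive lines so they can be reviewed at a glance.
    # (grep returns a non-zero status if the script contains no directives.)
    grep '^#SBATCH' "$script"
}

# Example:
# check_job_script job.slurm
```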
Once your code, dataset, and submission script are transferred to the remote system, you will need to issue commands to the remote system. At a minimum, you will need to issue a command to submit your job to the job queue so that the code will execute. Issuing commands requires a connection to the remote system via SSH. From the terminal software specific to your platform, the following command will connect you to statlab via SSH.
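A typical form of the connection command is sketched here. The helper name `connect_statlab` and the username are placeholders; substitute your own account name.

```shell
# Open an interactive SSH session on the Statlab head node.
# Usage: connect_statlab USERNAME
connect_statlab() {
    ssh "$1@statlab.princeton.edu"
}

# Example (placeholder username):
# connect_statlab your_username
```

This is equivalent to typing `ssh your_username@statlab.princeton.edu` directly at the prompt.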
Test whether you successfully connected to the Statlab server by issuing the following command. It should return the server's address, statlab.princeton.edu, if you were successful.
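The standard command for printing a machine's name is `hostname`; it is assumed here to be the command this step refers to.

```shell
# Print the name of the machine this shell is running on.
# On the Statlab head node this is expected to be statlab.princeton.edu.
hostname
```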
Submit a Job
Once you have established an SSH connection to the Statlab server you can issue the following command to submit your job:
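With SLURM, jobs are submitted with `sbatch`. The sketch below assumes the script is named job.slurm, as in the preparation step, and sits in the current directory; the helper name `submit_job` is made up for illustration. It uses sbatch's `--parsable` option so that the job number can be captured in a variable.

```shell
# Submit a script and print only the job number.
# Usage: submit_job job.slurm
submit_job() {
    local out
    # --parsable prints the job ID alone, or "ID;cluster" on multi-cluster systems.
    out=$(sbatch --parsable "$1") || return 1
    # Drop the optional ";cluster" suffix.
    printf '%s\n' "${out%%;*}"
}

# Example:
# job_id=$(submit_job job.slurm) && echo "Submitted job ${job_id}"
```

Running `sbatch job.slurm` directly works as well and prints a line of the form `Submitted batch job <number>`.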
The jobs you have submitted to the queue can be listed with the following command:
squeue -u $USER
Examine the Results
Every job receives a job number, which is reported immediately after submission to the queue and also indicated as part of the slurm.out file, a file generated for every job submission and located in your file space on the Statlab server. The slurm.out file contains any job errors or standard output not explicitly directed to a custom output file. Examine this file for any errors in code execution.
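By default SLURM names this file slurm-<job number>.out and writes it to the directory from which the job was submitted; the sketch below assumes that default has not been overridden on statlab. The helper name `show_job_output` and the search pattern are illustrative only.

```shell
# Show a job's output file and flag lines that look like errors.
# Usage: show_job_output JOB_NUMBER [DIRECTORY]
show_job_output() {
    local file="${2:-.}/slurm-$1.out"
    if [ ! -f "$file" ]; then
        echo "No output file found at $file" >&2
        return 1
    fi
    cat "$file"
    echo "--- possible errors ---"
    # Case-insensitive search for common failure words, with line numbers.
    grep -n -i -E 'error|traceback|killed' "$file" || echo "(none found)"
}

# Example (placeholder job number):
# show_job_output 12345
```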