[[File:Graham1.png|300px]]  [[File:Graham3.png|300px]]  [[File:Graham2.png|250px]]
'''IMPORTANT''': to log in to Graham, you have to use your Compute Canada credentials (login name and password), not your SHARCNET credentials!
Graham is the largest and by far the most powerful cluster in the current SHARCNET fleet of supercomputers. Graham is also known as GP3, and is part of the major renewal of academic supercomputers in Canada in 2017, with the other new systems being Arbutus (GP1) at the University of Victoria, Cedar (GP2) at Simon Fraser University, and Niagara (LP) at the University of Toronto. A SHARCNET system notice will be sent to all users when Graham is ready for access. SHARCNET users will be able to log in to this system at graham.computecanada.ca with their Compute Canada username and password. In the meantime, several resources have been put in place for users to familiarise themselves with this new system.
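For example, connecting from a terminal looks like the sketch below (''username'' is a placeholder for your Compute Canada login name):

```shell
# Log in to Graham with your Compute Canada credentials
ssh username@graham.computecanada.ca
```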
General information about migrating work from existing systems to the new national general purpose systems is available on the Compute Canada Wiki page at:
https://docs.computecanada.ca/wiki/Code_and_job_migration_from_legacy_systems
Properties of the system including its address, node composition and file systems, etc. can be found on the Compute Canada Wiki page at:
https://docs.computecanada.ca/wiki/Graham
Instructions for running jobs via the Slurm scheduler are available on the Compute Canada Wiki page at:
https://docs.computecanada.ca/wiki/Running_jobs
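As a minimal orientation, a serial job script might look like the sketch below (the account name and executable are placeholders; see the Running jobs page above for authoritative details):

```shell
#!/bin/bash
#SBATCH --account=def-someuser   # placeholder: your Slurm account name
#SBATCH --time=0:15:0            # walltime limit (h:mm:s)
#SBATCH --mem-per-cpu=1024M      # memory per core
./serial_code                    # placeholder executable
```

You would then submit it with "sbatch job.sh".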
A list of software packages available on the system can be found on the Compute Canada Wiki page at:
https://docs.computecanada.ca/wiki/Available_software
A recent SHARCNET General Interest Webinar describes what to expect from the new systems and demonstrates some important usage differences from other SHARCNET systems. A recording of this webinar is available at the SHARCNET YouTube channel:
https://www.youtube.com/watch?v=VYaLlQ4Q8pI
Short introductory video recordings covering different aspects of the new national general purpose clusters are available as a playlist at the Compute Canada YouTube channel:
https://www.youtube.com/channel/UC2f3cwviToj-mazutBNhzFw
Once the Graham system is available, SHARCNET staff will present daily demonstrations of basic workflow on Graham. Following a brief usage demonstration, the support staff will stay online for the remainder of the hour to discuss access and usage topics relating to the Graham system. These live demonstrations/discussions will be posted with other SHARCNET events on the calendar at:
https://www.sharcnet.ca/my/news/calendar
For support requests relating to the Graham system, email support@computecanada.ca or help@sharcnet.ca.
== Quick facts ==
* Number of CPU cores: 33,448
* Number of nodes: 1043
* Total memory (RAM): 149 TB (4.6 GB/core on average)
* Number of NVIDIA P100 GPUs: 320
* Networking: EDR (CPU nodes) and FDR (GPU nodes) InfiniBand
== Default account ==
Every Graham user gets at least one (default) account on Graham's scheduler, named def-userid (where userid is the user's login name). Users with RAC allocation(s) also get additional account(s). Each job script (and each salloc command) has to specify the intended account name via the "-A account" argument.
To make life easier, you can add the following lines at the end of your .bashrc file:
 export SLURM_ACCOUNT=def-$USER
 export SBATCH_ACCOUNT=$SLURM_ACCOUNT
 export SALLOC_ACCOUNT=$SLURM_ACCOUNT
After that, you have to log out and log back in. From then on, you only need the "-A account" argument in your job scripts when you use a non-default account (say, a RAC account).
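As a quick sanity check (a sketch assuming a standard bash login shell, where $USER holds your login name), you can confirm the three variables all expand to the same account string:

```shell
# Same three lines as suggested for .bashrc above
export SLURM_ACCOUNT=def-$USER
export SBATCH_ACCOUNT=$SLURM_ACCOUNT
export SALLOC_ACCOUNT=$SLURM_ACCOUNT

# All three print the same def-<userid> string
echo "$SLURM_ACCOUNT"
echo "$SBATCH_ACCOUNT"
echo "$SALLOC_ACCOUNT"
```

With these in place, plain "sbatch job.sh" or "salloc" submits under your default account without an explicit "-A".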
== Test jobs ==
Graham has a small number of interactive nodes reserved for short (under 12 hours) jobs. They should only be used for testing and debugging your code.
To access these nodes, use the following commands (instead of sbatch) when submitting the job:
 $ srun -t 0:10:0 -n 1 -A account -o out.log ./serial_code &
 $ srun -t 0:10:0 -n 8 -A account -o out.log ./mpi_code &
 $ OMP_NUM_THREADS=8 srun -t 0:10:0 -c 8 -A account -o out.log ./multithreaded_code &
You can add other srun arguments (the same ones sbatch accepts), such as --mem-per-cpu.
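For example (a sketch; "account" remains a placeholder as above), requesting 2 GB per core for the serial test would look like:

```shell
$ srun -t 0:10:0 -n 1 --mem-per-cpu=2048M -A account -o out.log ./serial_code &
```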
  
 
== Useful links ==
 
* [https://docs.computecanada.ca/wiki/Graham Compute Canada page on Graham]
 
* Top500 article [https://www.top500.org/news/canada-is-quietly-adding-10-petaflops-to-its-network-of-academic-supercomputers Canada Is Quietly Adding 10 Petaflops to Its Network of Academic Supercomputers]
 
[[Category:Systems]]

Latest revision as of 13:42, 11 September 2017