
Cluster: graham.sharcnet.ca

Links: System documentation in the SHARCNET Help Wiki

Manufacturer: Huawei
Operating System: CentOS 7
Interconnect: EDR + FDR InfiniBand
Total processors/cores: 33448
Nodes

Nodes 1-800
    32 cores (2 sockets x 16 cores per socket)
    Intel E5-2683 v4 (Broadwell) @ 2.1 GHz
    Type: Compute
    Notes: Base profile compute nodes.
    Memory: 128.0 GB
    Local storage: 1.2 TB

Nodes 801-803
    56 cores (4 sockets x 14 cores per socket)
    Intel E7-4850 v3 (Haswell) @ 2.2 GHz
    Type: Compute
    Memory: 3072.0 GB
    Local storage: 1.2 TB

Nodes 804-827
    32 cores (2 sockets x 16 cores per socket)
    Intel E5-2683 v4 (Broadwell) @ 2.1 GHz
    Type: Compute
    Memory: 512.0 GB
    Local storage: 1.2 TB

Nodes 828-987
    32 cores (2 sockets x 16 cores per socket)
    Intel E5-2683 v4 (Broadwell) @ 2.1 GHz
    Type: Compute
    Notes: Accelerated compute nodes with 2 x NVIDIA Pascal P100 GPUs (12 GB HBM2)
    Memory: 128.0 GB
    Local storage: 800 TB

Nodes 988-1043
    32 cores (2 sockets x 16 cores per socket)
    Intel E5-2683 v4 (Broadwell) @ 2.1 GHz
    Type: Compute
    Notes: Cloud configuration
    Memory: 256.0 GB
    Local storage: 1.2 TB

Total attached storage: 14500 TB
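
As a quick consistency check on the figures above, the per-range core counts can be summed and compared against the stated total of 33448 processors/cores. The following is only a minimal sketch: the node ranges and cores-per-node values are copied from the listing, the ranges are assumed to be inclusive, and the names `NODE_GROUPS`, `total_nodes`, and `total_cores` are illustrative rather than part of any SHARCNET tooling.

```python
# Node groups copied from the listing above:
# (first node, last node, cores per node). Ranges are assumed inclusive.
NODE_GROUPS = [
    (1, 800, 32),     # base compute nodes
    (801, 803, 56),   # large-memory nodes (3072 GB)
    (804, 827, 32),   # 512 GB nodes
    (828, 987, 32),   # GPU nodes
    (988, 1043, 32),  # cloud configuration
]


def total_nodes(groups):
    """Count nodes across all inclusive (first, last) ranges."""
    return sum(last - first + 1 for first, last, _ in groups)


def total_cores(groups):
    """Sum nodes-in-range times cores-per-node over all groups."""
    return sum((last - first + 1) * cores for first, last, cores in groups)


if __name__ == "__main__":
    print("nodes:", total_nodes(NODE_GROUPS))
    print("cores:", total_cores(NODE_GROUPS))
```

Under these assumptions the sum agrees with the "Total processors/cores" figure given in the listing.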
Suitable use

Heterogeneous cluster, suitable for a variety of workloads.

Software available

STAR-CCM+, MAP, ANSYS, ESPRESSO, GAUSSIAN, COMSOL, MATLAB, DDT, INTEL, ADF/BAND, AMBER, LSDYNA


Recent System Notices

Oct 02 2018, 02:18PM

Starting Tuesday, October 9th, 2018 at 10 p.m. ET, the Graham cluster will be unavailable to all users and running jobs will be terminated. This outage is required because of electrical work being done by the regional utility, which will affect half of the Waterloo campus. We will take advantage of this unexpected downtime to update the cluster and improve its stability, performance, and overall security.

The cluster should be reopened Thursday, October 11th.

Sep 28 2018, 08:23AM

The patching of the project file system went well and the cluster has been reopened.

For more details: http://status.computecanada.ca/

Sep 26 2018, 12:32PM

Starting Thursday, September 27, 2018 at 8 a.m. ET, the Graham cluster will be unavailable to all users and running jobs will be terminated. During this outage we will be patching and upgrading the /project file system. This is required to properly clean up after the recent file system issue and to help prevent a recurrence.

Graham will reopen to users by 8 a.m. ET on Friday, September 28.

For more details: http://status.computecanada.ca/

Sep 07 2018, 03:05PM

Graham returned to service on September 5, with a problem that affects some files on the /project file system. We expect to have all the affected files restored to normal in about a week.

For more details: http://status.computecanada.ca/

Aug 02 2018, 01:02PM

Starting Tuesday, August 21, 2018 at 7 a.m. ET, the Graham cluster will be unavailable to all users and running jobs will be terminated. The power feed to the building is being upgraded, which requires a complete shutdown. This is a major project that will take about eight (8) days to complete. The Graham cluster, all storage, Globus, cloud, TSM backup services and all associated hardware will be unavailable to users during the outage.

Users are encouraged to migrate to Cedar before the outage and to shut down their running virtual machines in Graham Cloud before the downtime. If you have used Niagara in the past, you can also consider that cluster as an alternative.

Please watch http://status.computecanada.ca for updates on the availability of Graham and all other national systems.

Start Date: Tuesday, August 21, 2018
Start Time: 7:00 a.m. ET
Anticipated End Date: Wednesday, August 29, 2018

Users will be notified by email when the cluster is up and running again.

The duration of this outage is subject to change; updates will be posted to http://status.computecanada.ca and via the Compute Canada, Compute Ontario, and SHARCNET Twitter accounts. If the outage is extended by more than two days, another user communication will be sent directly.

For questions, or assistance migrating to other national systems, please email support@computecanada.ca
