Cluster: saw.sharcnet.ca

Links: System documentation in the SHARCNET Help Wiki

Manufacturer: HP
Operating System: CentOS 6.x
Interconnect: DDR InfiniBand
Total processors/cores: 2712
Nodes:

saw: 1-336 (Compute nodes)
  8 cores: 2 sockets x 4 cores per socket, Intel Xeon @ 2.83 GHz
  Memory: 16.0 GB; Local storage: None

saw: 8001 (Admin node)
  8 cores: 2 sockets x 4 cores per socket, Intel Xeon @ 2.83 GHz
  Memory: 16.0 GB; Local storage: None

saw: 9001-9002 (Login nodes)
  8 cores: 2 sockets x 4 cores per socket, Intel Xeon @ 2.83 GHz
  Memory: 16.0 GB; Local storage: None
Total attached storage: 127 TB
Suitable use

Parallel applications.

Software available

MATLAB, GAUSSIAN, LSDYNA, OPENJDK, CHARM++, CP2K, ACML, ECLIPSE, FREEFEM++, UTIL, ABAQUS, INTEL, SIESTA, ADF/BAND, R, LAMMPS, PYTHON, PARI/GP, CONVERGE, OCTAVE, NETCDF, MERCURIAL, OPENMPI, BLAST, ABINIT, NWCHEM, DAR, MPFUN90, GCC, MKL, OPEN64, PETSC_SLEPC, ORCA, SAMTOOLS, CMAKE, GIT, GNU, FFTW, BOOST, ESPRESSO, CDF, QD, GEANT4, TINKER, ARPACK-NG, ANSYS, SUBVERSION, STAR-CCM+, HDF, GROMACS, MrBAYES, GMP, BINUTILS, MPC, BIOPERL, CPMD, PERL, SPRNG, MPFR, VIM, AMBER, VALGRIND, BIOSAMTOOLS, RLWRAP, MPFUN2015, TEXLIVE, YT, DLPOLY, PROOT, COREUTILS, NAMD, SUPERLU, PNETCDF, FDTD, GNUPLOT, GSL, MPIBLAST, SQ, BIOPERLRUN, ILOGCPLEX, IPM, PGI, OPENCV, LDWRAPPER, CPAN, RUBY, NIX, GHC, VMD
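
Most of this software is provided through environment modules; the exact module names and versions vary by system, so the commands below are only an illustrative sketch rather than the precise module names on saw:

module avail            # list the software modules installed on the cluster
module load openmpi     # load a package (here, an OpenMPI stack) into your environment
module list             # confirm which modules are currently loaded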

Recent System Notices

Jan 25 2016, 11:40AM
(17 days ago)

The saw cluster is back online after a hardware failure on its administrative node.

Due to the nature of the hardware failure, some jobs may have been killed, and will need to be restarted.

The system is currently listed as “online, but may be unstable” due to continuing issues with the login node.

Jan 23 2016, 08:48AM
(19 days ago)

The saw cluster is presently unavailable due to a hardware failure on its administrative node.

We are currently working to correct this problem, and will post an update when it is complete.

Jan 22 2016, 03:44PM
(20 days ago)

The hardware failure on the Saw scratch filesystem, which is also mounted by the Angel and Mosaic clusters, has been repaired, and the scratch filesystem is now mounting properly again on the Saw cluster.

The Saw cluster is now returning to “Testing” status for a few more days as we continue to work on possible instability of the login node. Note that this instability will not affect running or scheduled jobs, and as stated before, if the login node is offline, saw can still be accessed to submit and monitor jobs by logging into another SHARCNET cluster and using “ssh saw-dev1.saw” on the command line.

Jan 22 2016, 01:47PM
(20 days ago)

Due to a hardware failure, the Saw scratch filesystem, which is also mounted by the Angel and Mosaic clusters, is unavailable. We are currently working to correct the problem, and will post an update when the problem is resolved.

Jan 08 2016, 02:09PM
(about 1 month ago)

Due to the instability of the Saw cluster’s login nodes over the past week, the saw cluster is being placed in “testing” status.

If the login node is down, the saw cluster can still be accessed by the following method:

First, connect to another SHARCNET cluster, such as Orca, via ssh.

Next, use the following command:

ssh saw-dev1.saw

This will connect you to one of saw’s development nodes, which can be used to submit and monitor jobs running on the cluster.
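
Taken together, a full session might look like the following. The user name is a placeholder, and the job-submission lines assume the SQ scheduler front end (sqsub/sqjobs) listed in the software table above, with illustrative queue, process-count, and runtime values:

ssh username@orca.sharcnet.ca       # connect to another SHARCNET cluster
ssh saw-dev1.saw                    # hop to one of saw's development nodes
sqsub -q mpi -n 8 -r 1h -o run.log ./my_program    # submit an 8-process MPI job (values illustrative)
sqjobs                              # monitor your queued and running jobs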

Saw-scratch can be accessed from our Data Transfer Node, dtn.sharcnet.ca, at /scratch/saw/scratch/USERID/ (where USERID is your SHARCNET user ID) to upload and download your data.
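
For example, files can be copied through the Data Transfer Node with standard tools such as scp or rsync; the file and directory names below are placeholders:

scp USERID@dtn.sharcnet.ca:/scratch/saw/scratch/USERID/results.tar.gz .    # download a results file
rsync -av input_data/ USERID@dtn.sharcnet.ca:/scratch/saw/scratch/USERID/input_data/    # upload a directory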

The issues affect only the login nodes of Saw and do not affect any queued or running jobs. We apologize for any inconvenience these issues may have caused.
