Cluster monk.sharcnet.ca

Links System documentation in the SHARCNET Help Wiki

Manufacturer IBM
Operating System CentOS
Interconnect QDR InfiniBand
GPU 2x M2070 per node
Total processors/cores 432
Nodes
monk: 1‑54
Type: Compute
Cores: 8 (2 sockets x 4 cores per socket)
Processor: Intel E5607 @ 2.26 GHz
Memory: 48.0 GB
Local storage: 0 Bytes
Total attached storage 66 TB
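
The listed total of 432 cores follows from the node figures above. As a minimal sketch, assuming all 54 nodes share the single configuration shown (the page lists only one node range), the totals can be checked as below. The variable names are illustrative, and the aggregate GPU and memory figures are derived here rather than stated on this page.

```python
# Node figures copied from the specification list; names are illustrative.
# Assumption: every node in the range monk 1-54 has the same configuration.
NODES = 54                 # monk: 1-54
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 4
GPUS_PER_NODE = 2          # 2x M2070 per node
MEMORY_PER_NODE_GB = 48.0

cores_per_node = SOCKETS_PER_NODE * CORES_PER_SOCKET
total_cores = NODES * cores_per_node

# Derived values, not quoted on the page itself.
total_gpus = NODES * GPUS_PER_NODE
total_memory_gb = NODES * MEMORY_PER_NODE_GB

print(f"cores per node:  {cores_per_node}")
print(f"total cores:     {total_cores}")
print(f"total GPUs:      {total_gpus}")
print(f"total memory GB: {total_memory_gb}")
```

The computed core count matches the "Total processors/cores" entry, which is consistent with the assumption that the nodes are uniform.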
Suitable use

Parallel applications with GPU acceleration

Software available

CUDA, PGI, GCC, FFTW, SIESTA, PETSC_SLEPC, PYTHON, UTIL, CMAKE, R, DAR, INTEL, OPENMPI, PARI/GP, ACML, OCTAVE, NETCDF, BOOST, MERCURIAL, SUBVERSION, HDF, ORCA, NAMD, OPENJDK, SAMTOOLS, CDF, GNU, GMP, NCL, BIOPERL, TINKER, NCBIC++TOOLKIT, OPEN64, QD, GSL, GDB, MrBAYES, BINUTILS, PERL, VIM, SPRNG, MKL, BIOSAMTOOLS, MPFR, VALGRIND, MPFUN2015, MPFUN90, TEXLIVE, MPC, PNETCDF, CHARM++, YT, GNUPLOT, IPM, BIOPERLRUN, SQ, COREUTILS, LLVM, OPENCV, LDWRAPPER, MONO, ARPACK-NG, EMACS, CPAN, RUBY, NIX, PROOT, GHC, VMD, SYSTEM, AUTODOCKVINA, AMBER, GROMACS, GEANT4, GIT, BLAST, NINJA

Current system state details Graphs

Recent System Notices

Status Status Notes
Feb 05 2019, 12:05PM
(13 days ago)

One of the legacy global filesystems will be migrated to new hardware on Wednesday February 20th. To complete this we must unmount the filesystem from all clusters and prevent jobs from running during the outage.

All legacy clusters will be configured to avoid running any jobs after 3pm on February 19.

We expect all legacy clusters to return to service the following day at 10am.

This outage does not affect Graham or Orca.

Jul 09 2018, 04:01PM
(7 months ago)

Cluster is back up after a failure of the cooling system was resolved.

Jul 05 2018, 02:58PM
(8 months ago)

Cluster is down due to a failure of the cooling system. Maintenance technicians have advised that the cooling system cannot handle the recent very hot weather and have requested that we keep the system shut down until Monday July 9.

Jul 05 2018, 02:35PM
(8 months ago)

dusky, goblin and monk are down after brief thunderstorm-induced power outages. We’re waiting until the storm passes before restarting; this should take just a few minutes.

Jul 03 2018, 10:24AM
(8 months ago)

Cluster is down due to a failure of the cooling system. Maintenance technicians have advised that the cooling system cannot handle the recent very hot weather and have requested that we keep the system shut down until Monday July 9.