Cluster kraken.sharcnet.ca

Due to scheduled power outages at multiple SHARCNET institutions, several clusters (angel, copper, guppy, mako, mosaic, orca, redfin and saw) and the global /work filesystem are currently unavailable. They will return to service no later than 1 pm on Monday, May 2.
We will use this outage to perform some much-needed maintenance on a number of critical systems.

Other clusters will remain up, but any jobs that require access to /work will crash. Users should submit only jobs whose file access is limited to /home and /scratch.
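
As a rough pre-submission check for the restriction above, a job script can be scanned for literal references to /work. The following is a minimal sketch, not a SHARCNET tool: the function name `lines_referencing_work` and the sample script are hypothetical, and the check is only a text heuristic, so it will miss /work paths that are built from environment variables or reached through symlinks.

```python
import re

# Matches /work as a top-level path (e.g. /work/user/data), but not
# /workspace or /home/user/work. This is a heuristic, not a guarantee.
WORK_PATH = re.compile(r"(?<![\w/.-])/work(?=[/\s\"':;]|$)")


def lines_referencing_work(script_text):
    """Return (line number, stripped line) pairs that mention /work."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        if WORK_PATH.search(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical job script used only to illustrate the check.
sample_script = """#!/bin/bash
cd /scratch/user/run1
cp /work/user/input.dat .
./a.out > /home/user/workspace/out.log
"""

for number, line in lines_referencing_work(sample_script):
    print(f"line {number} references /work: {line}")
```

A script that produces no output from this check may still touch /work indirectly, so it is worth reviewing input and output paths by hand as well.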
Links System documentation in the SHARCNET Help Wiki

Manufacturer HP
Operating System CentOS 5.4
Interconnect Myrinet 2g (gm)
Total processors/cores 1968
Nodes
narwhal: 1‑267
4 cores
2 sockets x 2 cores per socket
AMD Opteron @ 2.2 GHz
Type: Compute
Notes: Compute nodes.
Memory: 8.0 GB
Local storage: 30 GB
bull: 1‑96
4 cores
4 sockets x 1 core per socket
AMD Opteron @ 2.4 GHz
Type: Compute
Notes: N/A
Memory: 32.0 GB
Local storage: 150 GB
bull: 128‑159
4 cores
2 sockets x 2 cores per socket
AMD Opteron @ 2.2 GHz
Type: Compute
Notes: N/A
Memory: 8.0 GB
Local storage: 80 GB
bull: 301‑396
4 cores
2 sockets x 2 cores per socket
AMD Opteron @ 2.2 GHz
Type: Compute
Notes: N/A
Memory: 8.0 GB
Local storage: 80 GB
kraken: 240
2 cores
AMD Opteron @ 2.2 GHz
Type: Admin
Memory: 8.0 GB
Local storage: 80 GB
kraken: 241
2 cores
AMD Opteron @ 2.2 GHz
Type: Login
Memory: 4.0 GB
Local storage: 160 GB
Total attached storage 2.73 TB
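
The total of 1968 processors/cores can be cross-checked against the node ranges listed above. The sketch below simply multiplies the size of each inclusive node range by its cores per node; the names `NODE_GROUPS` and `total_cores` are illustrative and not part of any SHARCNET tooling.

```python
# Node groups as listed above: (name, first node, last node, cores per node, type).
NODE_GROUPS = [
    ("narwhal", 1, 267, 4, "Compute"),
    ("bull", 1, 96, 4, "Compute"),
    ("bull", 128, 159, 4, "Compute"),
    ("bull", 301, 396, 4, "Compute"),
    ("kraken", 240, 240, 2, "Admin"),
    ("kraken", 241, 241, 2, "Login"),
]


def total_cores(groups, node_type=None):
    """Sum cores over inclusive node ranges, optionally for one node type."""
    return sum(
        (last - first + 1) * cores
        for _, first, last, cores, kind in groups
        if node_type is None or kind == node_type
    )


print("all nodes:", total_cores(NODE_GROUPS))
print("compute nodes only:", total_cores(NODE_GROUPS, "Compute"))
```

Under these figures the compute nodes account for 1964 cores, and the admin and login nodes contribute the remaining 4.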
Suitable use

Kraken is an amalgamation of older point-of-presence and throughput clusters. It is suitable for serial applications and for small-scale parallel MPI applications that demand low latency.

Software available

MAPLE, MrBAYES, COREUTILS, CP2K, MATLAB, SIESTA, CMAKE, PYTHON, NETCDF, IMSL, GEANT4, GIT, OPENJDK, UTIL, DDT, R, MPFUN90, PARI/GP, OPENMPI, BLAST, OCTAVE, SPRNG, HDF, ACML, GCC, MERCURIAL, DAR, SUBVERSION, PERL, INTEL, PETSC_SLEPC, AMBER, ORCA, GNUPLOT, CDF, SAMTOOLS, GMP, MKL, FFTW, BIOPERL, GNU, BOOST, ARPACK-NG, OPEN64, BIOSAMTOOLS, MPFUN2015, QD, VIM, TEXLIVE, RLWRAP, VALGRIND, YT, FDTD, CHARM++, MPFR, SUPERLU, PNETCDF, IPM, GROMACS, GSL, BINUTILS, MPC, BIOPERLRUN, SQ, ILOGCPLEX, PGI, OPENCV, LDWRAPPER, CPAN, RUBY, NIX, PROOT, ECLIPSE, GHC, ANSYS

Recent System Notices

Status Status Notes
Mar 10 2016, 11:28AM
(about 1 month ago)

Temperature has returned to normal so all ‘bul’ nodes are active again and jobs running on them have been resumed.

Mar 10 2016, 09:58AM
(about 1 month ago)

All ‘bul’ nodes from bul1 through bul159 are inactive, and jobs running on them have been temporarily suspended to reduce heat while the cooling system undergoes maintenance. We’re hopeful that we won’t have to shut down the nodes and lose the running jobs.

Mar 01 2016, 09:38AM
(about 1 month ago)

There will be a network outage on kraken and monk tomorrow morning (March 2) at around 7:30 am, lasting less than 2 minutes, in order to move to a new network port.

During the outage any live login sessions and jobs accessing global filesystems may hang but they will recover afterwards and continue normally. No job or data losses are expected.

Jan 27 2016, 10:45AM
(3 months ago)

The cluster has mostly recovered after last night's power outage and is usable again.

Jan 27 2016, 07:19AM
(3 months ago)

A power outage at Western has taken Goblin, Monk and portions of Kraken offline. We are working to restore these services.