This page is intended to serve as a community and information hub for our machine learning/deep learning and data mining users.
GRAHAM is a heterogeneous cluster located at the University of Waterloo and suitable for a variety of workloads. It has a total of 35,520 cores and 320 GPU devices, spread across 1,107 nodes of different types. Each GPU node has 128 GB of memory, 2 sockets with 16 cores per socket, and 2 NVIDIA P100 Pascal GPUs (12 GB HBM2 memory each). The CPUs are Intel "Broadwell" E5-2683 v4 at 2.1 GHz, and local storage is a 1.6 TB NVMe SSD.
Minsky is an IBM S822LC server with dual POWER8+ chips, 10 cores per socket, and 8-way SMT (simultaneous multithreading) per core. Its 4 NVIDIA Pascal P100 GPUs are connected with NVLink. A local SSD is mounted as /tmp and provides 700 GB of usable space.
Copper is a contributed cluster with 8 GPU nodes, each with 4 K80 cards (8 GK210 devices).
Mosaic is a contributed cluster with 20 GPU nodes, each with 1 K20 card.
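The per-node figures quoted above imply a few totals that are useful when sizing a job request. The following is a minimal sketch of that arithmetic, not an official inventory: the dictionary keys and variable names are illustrative, and the GRAHAM GPU node count assumes that every GPU node carries exactly 2 GPUs, as the description suggests.

```python
# Specs copied from the hardware descriptions on this page.
# Key and variable names are illustrative assumptions, not SHARCNET identifiers.
graham_gpu_node = {"sockets": 2, "cores_per_socket": 16, "gpus": 2}
graham_total_gpus = 320

minsky = {"sockets": 2, "cores_per_socket": 10, "smt_per_core": 8, "gpus": 4}

copper = {"gpu_nodes": 8, "devices_per_node": 8}   # 4 K80 cards = 8 GK210 devices
mosaic = {"gpu_nodes": 20, "devices_per_node": 1}  # 1 K20 card per node


def cores_per_node(node):
    """Physical cores in one node: sockets times cores per socket."""
    return node["sockets"] * node["cores_per_socket"]


# GRAHAM: cores per GPU node, and the GPU node count implied by 320 GPUs
# (assumes every GPU node has exactly 2 GPUs).
graham_cores_per_gpu_node = cores_per_node(graham_gpu_node)
graham_gpu_nodes = graham_total_gpus // graham_gpu_node["gpus"]

# Minsky: hardware threads = physical cores times SMT ways per core.
minsky_hw_threads = cores_per_node(minsky) * minsky["smt_per_core"]

# Contributed clusters: total GPU devices across all GPU nodes.
copper_devices = copper["gpu_nodes"] * copper["devices_per_node"]
mosaic_devices = mosaic["gpu_nodes"] * mosaic["devices_per_node"]

print("GRAHAM cores per GPU node:", graham_cores_per_gpu_node)
print("GRAHAM GPU nodes (implied):", graham_gpu_nodes)
print("Minsky hardware threads:", minsky_hw_threads)
print("Copper GPU devices:", copper_devices)
print("Mosaic GPU devices:", mosaic_devices)
```

These numbers are derived only from the descriptions on this page; check the individual system pages before relying on them for scheduling.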
This section lists software that is being used at SHARCNET, as well as other popular packages that users may wish to consider for their work. Ideally, each package listed below will have its own wiki page covering installation, configuration, and execution hints and tips, as well as a listing of groups at SHARCNET that are experienced with the software.
|Caffe||Caffe is a deep learning framework developed with cleanliness, readability, and speed in mind.||Caffe-wiki|
|Theano||Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently.||Theano-wiki|
|BIDMach||BIDMach is an interactive environment designed to make it extremely easy to build and use machine learning models.||BIDMach-wiki|
|DIGITS||The NVIDIA Deep Learning GPU Training System (DIGITS) puts the power of deep learning in the hands of data scientists and researchers.||DIGITS-wiki|
|cuDNN||The NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks.||cuDNN-wiki|
|Torch||Torch is a scientific computing framework with wide support for machine learning algorithms.||Torch-wiki|
|TensorFlow||TensorFlow is an open source software library for machine intelligence.||Tensorflow-wiki|
|IBM PowerAI||The PowerAI platform includes the most popular machine learning frameworks and their dependencies, and it is built for easy and rapid deployment.||Minsky-wiki|
General interest seminars:
- 2017/02/01 - Deep Learning on SHARCNET: Best Practices, Fei Mao, Abstract, slides
- 2016/04/27 - Deep Learning on SHARCNET: Tools you can use, Fei Mao, Abstract, slides
- 2015/02/04 - Deep Learning on SHARCNET: From CPU to GPU cluster, Fei Mao, Abstract, slides
The following staff have backgrounds in machine learning and data mining and may be able to help with domain-specific issues. Their contact information can be found in the SHARCNET staff directory.