Minsky is an IBM S822LC server with two POWER8+ processors, 10 cores per socket, and 8-way SMT (simultaneous multithreading) per core, for 160 hardware threads in total. Four NVIDIA Pascal P100 GPUs are connected with NVLink. A local SSD is mounted as /scratch and provides 700 GB of usable space.
$ lscpu
Architecture:          ppc64le
Byte Order:            Little Endian
CPU(s):                160
On-line CPU(s) list:   0-159
Thread(s) per core:    8
Core(s) per socket:    10
Socket(s):             2
NUMA node(s):          2
Model:                 1.0 (pvr 004c 0100)
Model name:            POWER8NVL (raw), altivec supported
CPU max MHz:           4023.0000
CPU min MHz:           2061.0000
L1d cache:             64K
L1i cache:             32K
L2 cache:              512K
L3 cache:              8192K
NUMA node0 CPU(s):     0-79
NUMA node1 CPU(s):     80-159

$ nvidia-smi topo -m
        GPU0    GPU1    GPU2    GPU3    mlx5_0  mlx5_1  mlx5_2  mlx5_3  CPU Affinity
GPU0     X      NV2     SOC     SOC     SOC     SOC     SOC     SOC     0-79
GPU1    NV2      X      SOC     SOC     SOC     SOC     SOC     SOC     0-79
GPU2    SOC     SOC      X      NV2     SOC     SOC     SOC     SOC     80-159
GPU3    SOC     SOC     NV2      X      SOC     SOC     SOC     SOC     80-159
mlx5_0  SOC     SOC     SOC     SOC      X      PIX     SOC     SOC
mlx5_1  SOC     SOC     SOC     SOC     PIX      X      SOC     SOC
mlx5_2  SOC     SOC     SOC     SOC     SOC     SOC      X      PIX
mlx5_3  SOC     SOC     SOC     SOC     SOC     SOC     PIX      X

Legend:
  X   = Self
  SOC = Connection traversing PCIe as well as the SMP link between CPU sockets(e.g. QPI)
  PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB = Connection traversing multiple PCIe switches (without traversing the PCIe Host Bridge)
  PIX = Connection traversing a single PCIe switch
  NV# = Connection traversing a bonded set of # NVLinks
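The CPU Affinity column in the topology output shows that GPU0 and GPU1 sit next to CPUs 0-79, while GPU2 and GPU3 sit next to CPUs 80-159. As a minimal sketch of how that mapping could be used, the hypothetical helper below (gpu_cpu_affinity is not a system command) returns the CPU list for a GPU id. Pinning a job with taskset is a suggestion, not a documented site requirement.

```shell
# Print the CPU list local to a GPU, hard-coded from the
# "nvidia-smi topo -m" output for this machine:
#   GPU0, GPU1 -> CPUs 0-79     GPU2, GPU3 -> CPUs 80-159
gpu_cpu_affinity() {
    case "$1" in
        0|1) echo "0-79" ;;
        2|3) echo "80-159" ;;
        *)   echo "unknown GPU id: $1" >&2; return 1 ;;
    esac
}

# Usage sketch ("program" is a placeholder for your own command):
# CUDA_VISIBLE_DEVICES=3 taskset -c "$(gpu_cpu_affinity 3)" program
```

Keeping a job on the socket nearest its GPU avoids the cross-socket SMP hop marked SOC in the table.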
IBM Advance Toolchain
The IBM Advance Toolchain for Linux on Power is a set of open source development tools (compiler, debugger and profiling tools) and runtime libraries that allow users to take leading edge advantage of IBM's latest POWER hardware features on Linux. For more information about it, visit http://ibm.co/AdvanceToolchain.
- A new update release for the 10.0 series of the IBM Advance Toolchain for Linux on Power is now installed under /opt/at10.0.
- This release provides many package updates, including:
  - GCC 6.2
  - Glibc 2.24
  - Binutils 2.27
  - GDB 7.11
  - Support for Ubuntu 16.04.
  - GCC provides fixes for complex IEEE 128-bit floating point and support for IEEE 128-bit floating point built-ins.
  - GCC creates binaries using -mcpu=power8 -mtune=power8 by default on ppc64le.
  - Valgrind provides a fix for missing support for the W field of the mtfsfi instruction.
  - Valgrind Itrace provides a new option to only start instruction tracing when a given function starts.
  - Cross-compiler packages are now signed.
  - OpenSSL includes fixes for 6 security advisories.
- PowerAI release 4.0 is installed and provides software packages for several Deep Learning frameworks, supporting libraries, and tools:
  - Bazel
  - Caffe (BVLC, IBM, and NVIDIA variants)
  - Chainer
  - DIGITS
  - NCCL
  - OpenBLAS
  - TensorFlow
  - Theano
  - Torch
All deep learning frameworks are installed in the folder /opt/DL:
caffe-bvlc - Berkeley Vision and Learning Center (BVLC) upstream Caffe, v1.0.0
caffe-ibm - IBM Optimized version of BVLC Caffe, v1.0.0
caffe-nv - NVIDIA fork of Caffe, v0.15.14
chainer - Chainer, v1.23.0
digits - DIGITS, v5.0.0
tensorflow - Google TensorFlow, v1.1.0
ddl-tensorflow - Distributed Deep Learning custom operator for TensorFlow
theano - Theano, v0.9.0
torch - Torch, v7
Login to the system
Minsky is currently being incorporated into the cloud. Updated access instructions will be posted here when that is done.
Getting started with IBM PowerAI MLDL Frameworks
Before running any GPU job, users should check GPU availability by using the command:
then choose an idle GPU by setting CUDA_VISIBLE_DEVICES= in front of the program to run. For example:
CUDA_VISIBLE_DEVICES=<gpu_ids> program...
CUDA_VISIBLE_DEVICES=3 program... (for single-gpu job)
CUDA_VISIBLE_DEVICES=0,1 program... (for multi-gpu job)
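One possible way to automate the choice is sketched below. It assumes that "idle" can be approximated by the lowest memory use reported by nvidia-smi in its CSV query mode; pick_idle_gpu is a hypothetical helper name, and a GPU with little memory in use may still be busy, so treat the result as a hint only.

```shell
# Read "index, memory_used_MiB" lines on stdin and print the index of the
# GPU with the least memory in use (the first one wins on a tie).
# Expected input format is that of:
#   nvidia-smi --query-gpu=index,memory.used --format=csv,noheader,nounits
pick_idle_gpu() {
    awk -F', *' '
        NR == 1 || $2 + 0 < min { min = $2 + 0; idx = $1 }
        END { if (NR > 0) print idx }
    '
}

# Only query the driver when nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    gpu=$(nvidia-smi --query-gpu=index,memory.used \
                     --format=csv,noheader,nounits | pick_idle_gpu)
    echo "least-loaded GPU: $gpu"
    # then, for example: CUDA_VISIBLE_DEVICES=$gpu program...
fi
```

The parsing is kept separate from the nvidia-smi call so that it can be checked with canned input on a machine without GPUs.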
- For best I/O performance, it is highly recommended to copy your input data to the local SSD storage at /scratch, while still writing outputs to /home or /work.
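A minimal sketch of that staging step follows. The helper name stage_input, the SCRATCH_ROOT variable, and the per-user subdirectory under /scratch are all assumptions made for illustration, not documented site conventions.

```shell
# Copy an input directory to fast local storage and print the staged path.
# SCRATCH_ROOT defaults to /scratch; the per-user subdirectory layout is
# an assumption, adjust it to whatever the site actually uses.
stage_input() {
    local src="$1"
    local root="${SCRATCH_ROOT:-/scratch}"
    local dest="${root}/${USER:-$(id -un)}/$(basename "$src")"
    mkdir -p "$dest" || return 1
    # -a preserves timestamps and permissions; "src/." copies the contents
    cp -a "$src/." "$dest/" || return 1
    printf '%s\n' "$dest"
}

# Usage sketch (paths and program are placeholders):
# data=$(stage_input /work/$USER/dataset)
# CUDA_VISIBLE_DEVICES=3 program --input "$data" --output /work/$USER/results
```

Since /scratch has only 700 GB shared by all users, it is considerate to delete staged data once the job has finished.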
Each framework package provides a shell script to simplify environment setup. We recommend that users update their shell rc file (e.g. .bashrc) to source the desired setup scripts. For example:
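As a minimal sketch of such an rc-file entry, the hypothetical helper below sources a setup script only when it is readable, so that a missing or renamed package does not produce errors in every new login shell. The example path in the comment is an assumption modeled on the DIGITS script location given later on this page; check /opt/DL/<framework>/bin for the actual script names.

```shell
# Source a setup script only if it exists and is readable.
# Returns non-zero (with a message on stderr) when the script is missing.
source_if_present() {
    local script="$1"
    if [ -r "$script" ]; then
        . "$script"
    else
        echo "skipping missing setup script: $script" >&2
        return 1
    fi
}

# Example ~/.bashrc entry (hypothetical path, verify before use):
# source_if_present /opt/DL/tensorflow/bin/tensorflow-activate
```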
Packages are provided for upstream BVLC Caffe (/opt/DL/caffe-bvlc), IBM optimized BVLC Caffe (/opt/DL/caffe-ibm), and NVIDIA's Caffe (/opt/DL/caffe-nv). The system default Caffe (/opt/DL/caffe) is IBM optimized Caffe. To activate the system default Caffe:
Alternatively, activate a specific variant. For example:
- Attempting to activate multiple Caffe packages in a single login session will cause unpredictable behavior.
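One way to guard against activating two variants by accident is sketched below. Everything in it is an assumption made for illustration: the helper name activate_caffe_once, the marker variable CAFFE_ACTIVE_VARIANT (which the PowerAI scripts do not set themselves), and the bin/caffe-activate script name inside each variant folder.

```shell
# Activate at most one Caffe variant per login session.
# Argument: variant folder under /opt/DL (caffe, caffe-bvlc, caffe-ibm, caffe-nv).
activate_caffe_once() {
    local variant="${1:-caffe}"
    # Script location is an assumption; adjust to the real file name.
    local script="${DL_ROOT:-/opt/DL}/${variant}/bin/caffe-activate"
    if [ -n "${CAFFE_ACTIVE_VARIANT:-}" ] &&
       [ "$CAFFE_ACTIVE_VARIANT" != "$variant" ]; then
        echo "already activated $CAFFE_ACTIVE_VARIANT;" \
             "start a new login session to use $variant" >&2
        return 1
    fi
    if [ ! -r "$script" ]; then
        echo "no setup script at $script" >&2
        return 1
    fi
    # Remember which variant is active so a later call can refuse a switch.
    . "$script" && export CAFFE_ACTIVE_VARIANT="$variant"
}
```

Calling the helper again with the same variant is harmless; asking for a different variant in the same session fails with a message instead of mixing environments.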
Once Caffe is activated, users can run the caffe command directly:
CUDA_VISIBLE_DEVICES=<gpu_id> caffe train --solver=...
To activate TensorFlow, run the command:
Then run your TensorFlow code with python:
CUDA_VISIBLE_DEVICES=<gpu_ids> python code.py
To activate Torch, run the command:
Then run your Lua code with th:
CUDA_VISIBLE_DEVICES=<gpu_ids> th code.lua
To activate Theano, run the command:
Then run your Theano code with python:
CUDA_VISIBLE_DEVICES=<gpu_ids> python code.py
To activate DIGITS, run the command:
source /opt/DL/digits/bin/digits-activate-sn
(This is a SHARCNET modification: the job folder is changed from /home to /work, which has a lot more space.)
To start the DIGITS server on the default port (5000):
CUDA_VISIBLE_DEVICES=<gpu_ids> digits-devserver
To start the DIGITS server on a specific port:
CUDA_VISIBLE_DEVICES=<gpu_ids> digits-devserver -p <port_num>
To use DIGITS, log in to any SHARCNET cluster in another session with X11 forwarding enabled (add -Y to the ssh command, e.g. ssh -Y firstname.lastname@example.org). You also need a web browser on the SHARCNET machine: either download Firefox from https://www.mozilla.org/en-US/firefox/all/ (choose the Linux 64-bit version) or copy /work/feimao/software_installs_old/firefox to your own folder. Go to the firefox folder, run ./firefox, and open the page:
http://minsky.uwo.sharcnet:5000 (or the port specified by user)