Revision as of 20:14, 25 May 2018 by Roberpj (Talk | contribs) (Versions)

Note: Some of the information on this page is for our legacy systems only. The page is scheduled for an update to make it applicable to Graham.
INTEL
Description: the default compiler on all sharcnet systems
SHARCNET Package information: see INTEL software page in web portal
Full list of SHARCNET supported software


Introduction

The Intel compiler module is loaded by default on all SHARCNET systems. The Intel compiler suite consists of 5 core components:

o icc -  C Compiler
o icpc - C++ Compiler
o mkl -  Math Kernel Library
o tbb -  Threading Building Blocks
o ipp -  Integrated Performance Primitives

Version Selection

Sharcnet provides the following versions of the Intel® Parallel Studio XE Composer Edition for Linux.

Version 2015 Update 3

module load intel/15.0.3

Starting with the intel/15 compiler, the mkl module is included, as are tbb and ipp. Individual components (flavors) may also be loaded and used separately. The intel/icc/15.0.3 module provides the Intel C++ Compiler with relaxed conflict statements to permit mixing with other compilers; likewise the intel/ifc/15.0.3 module provides the Intel Fortran Compiler. Other subtle differences between module flavors may be found by running the diff command on the respective module show outputs. The release version of each component may be found by running the module help command, for example module help intel/tbb/15.0.3. Compatibility with earlier Intel compilers for each component may be found in the release notes.
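The flavor comparison described above can be sketched as a small shell function. This is only an illustration: the function name diff_module_show is hypothetical, and it assumes the module command is available in the current shell and that module show writes its listing to stderr, as environment-modules implementations typically do.

```shell
# Sketch: compare two module flavors by diffing their "module show" output.
# diff_module_show is a hypothetical helper name, not a SHARCNET tool.
diff_module_show() {
    tmp_a=$(mktemp) || return 2
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 2; }
    # "module show" typically writes to stderr, so capture both streams.
    module show "$1" > "$tmp_a" 2>&1
    module show "$2" > "$tmp_b" 2>&1
    status=0
    diff "$tmp_a" "$tmp_b" || status=$?
    rm -f "$tmp_a" "$tmp_b"
    # diff exits with 1 when the files differ; only values above 1 are errors.
    [ "$status" -le 1 ]
}

# Example (on a system that provides these modules):
#   diff_module_show intel/15.0.3 intel/icc/15.0.3
```

Temporary files are used instead of process substitution so the function also works in plain POSIX shells.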

Version 2013 SP1 Update 4

module load intel/14.0.4
module load mkl/11.1.4

Version 2011 Update 9

This is the default compiler on all of SHARCNET's CentOS 6 production clusters.

module load intel/12.1.3
module load mkl/10.3.9

Version 11.1 Update 5

module load intel/11.1.069

Version 11.0 Update 84

This is the default compiler on SHARCNET's remaining two CentOS 5 clusters.

module load intel/11.0.083

Job Submission

sqsub -r 1h -o ofile.%J ./a.out

Example Usage

To demonstrate how to compile a simple program with the Intel compiler and submit a job, consider the Intel example in which two matrices are multiplied using the MKL dgemm routine. The ifort link line is determined by consulting the Math Kernel Library Link Line Advisor and uses MKLROOT, which the SHARCNET intel/15.0.1 module sets automatically for convenience:

cp /opt/sharcnet/intel/15.0.1/composerxe/Samples/en_US/mkl/tutorials.zip .
unzip tutorials.zip
cd tutorials/mkl_mmx_f   [optionally go into mkl_mmx_c to run the c variant]
module unload intel mkl openmpi
module load intel/15.0.1  [or module load intel/ifc/15.0.1 intel/mkl/15.0.1]
ifort dgemm_example.f -L$MKLROOT/lib/intel64 -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lm
sqsub -r 1h -o ofile.%J ./a.out
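For build scripts it can be convenient to generate the sequential MKL link line from MKLROOT instead of retyping it. The following is a minimal sketch using exactly the flags from the ifort command above. The function name mkl_sequential_link_flags is hypothetical, and other MKL configurations (threaded, ILP64) need a different line from the Link Line Advisor.

```shell
# Sketch: print the sequential MKL link flags used in the example above.
# mkl_sequential_link_flags is a hypothetical helper name.
mkl_sequential_link_flags() {
    # MKLROOT is set by the intel module; fail clearly if it is missing.
    if [ -z "${MKLROOT:-}" ]; then
        echo "MKLROOT is not set; load an intel module first" >&2
        return 1
    fi
    echo "-L$MKLROOT/lib/intel64 -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lm"
}

# Example (after "module load intel/15.0.1"):
#   ifort dgemm_example.f $(mkl_sequential_link_flags)
#   sqsub -r 1h -o ofile.%J ./a.out
```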

o Note that as of version 2015 the Intel examples are found in the same directory:

[saw-login1: /opt/sharcnet/intel/15.0.1/composerxe/Samples/en_US] ls
C++  Fortran  mkl

o In earlier versions the Intel examples reside in 3 different directories, such as:

[saw-login1: /opt/sharcnet/intel/14.0.4/icc/composerxe/Samples/en_US] ls
C++
[saw-login1: /opt/sharcnet/intel/14.0.4/ifc/composerxe/Samples/en_US] ls
Fortran
[saw-login1: /opt/sharcnet/mkl/11.1.4/composerxe/Samples/en_US] ls
mkl

General Notes

Help Categories

Besides using the man pages, help is available from the compilers directly in a categorized format. To print all the available categories one would run the icc -help command as shown below. To then get help on the ipo category for example, one would simply run: icc -help ipo

[roberpj@orc-login2:~] icc -help
Valid categories include:
       advanced        - Advanced Optimizations
       codegen         - Code Generation
       compatibility   - Compatibility
       component       - Component Control
       data            - Data
       deprecated      - Deprecated Options
       diagnostics     - Compiler Diagnostics
       float           - Floating Point
       help            - Help
       inline          - Inlining
       ipo             - Interprocedural Optimization (IPO)
       language        - Language
       link            - Linking/Linker
       misc            - Miscellaneous
       opt             - Optimization
       output          - Output
       pgo             - Profile Guided Optimization (PGO)
       preproc         - Preprocessor
       reports         - Optimization Reports
       openmp          - OpenMP and Parallel Processing

Intel Floating-Point Note

Intel compilers may default to potentially unsafe optimizations for floating-point operations. Users of the Intel compilers should read the Intel man pages (e.g., man icpc) and are advised to use one of two options, -fp-model precise or -fp-model source, for ANSI/ISO/IEEE standards-compliant floating-point support. For more details, read the Intel slideshow Floating-point control in the Intel compiler and libraries.

Optimization

Before running codes compiled with the Intel compiler on non-Intel login, compute, or devel nodes, one should be aware of Intel's Optimization Notice. The main implication of this notice in the SHARCNET context is that a program compiled with the default Intel compiler and default Intel compiler options may run significantly slower on Opteron (non-Intel) compute nodes. The solution involves both profiling the code to track down and fix inefficient blocks and experimenting with compiler options such as -ipo and -O3, before running any full-scale production jobs. As an example one might do:

module unload intel mkl
module load intel/15.0.1
icpc -O3 -ipo myprog.cpp

Note that when using IPO you must change your original archiver “ar” and linker “ld” to Intel® “xiar” and “xild”. Further reading on Interprocedural Optimization (IPO) can be found in the Key Features section of the User and Reference Guide for the Intel® C++ and Intel® Fortran Compilers. Another type of optimization involves writing vectorizable code and applying the default optimization level (-O2) or higher, as described in the Compiler Autovectorization Guide. Other factors that affect code speed are clock rate, memory bandwidth, and code layout as it relates to memory cache effectiveness.
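In a build script, the switch from ar/ld to xiar/xild can be tied to the presence of an -ipo option in the compiler flags. The sketch below is one possible way to do that. The function name ipo_tools and the file names in the usage comment are hypothetical.

```shell
# Sketch: choose archiver and linker names to match the compiler flags.
# With an -ipo option present the Intel tools xiar and xild are selected,
# otherwise the standard ar and ld. ipo_tools is a hypothetical helper name.
ipo_tools() {
    case " $1 " in
        *" -ipo"*) echo "xiar xild" ;;
        *)         echo "ar ld" ;;
    esac
}

# Example with hypothetical file names:
#   CXXFLAGS="-O3 -ipo"
#   set -- $(ipo_tools "$CXXFLAGS"); AR=$1
#   icpc $CXXFLAGS -c a.cpp b.cpp
#   $AR rcs libmylib.a a.o b.o
#   icpc $CXXFLAGS main.cpp libmylib.a
```

The pattern requires a space before -ipo, so an option such as -no-ipo does not select the Intel tools.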

Processor Specific Options

The compiler's default optimization level (-O2) generates optimized code for both Intel and compatible non-Intel processors of IA-32 or Intel 64 architecture that support at least SSE2. Further optimization may be achieved through IPO (interprocedural optimization, -ipo), PGO (profile-guided optimization, -prof-use), HLO (high-level loop/memory optimizations, -O3) and GAP (guided auto-parallelization, -guide). By default, when codes are compiled, the processor and instruction set requirements for the baseline code path are determined by the architecture, such as the flags listed in /proc/cpuinfo. For example, a code compiled on the orca login node would be optimized for the following microarchitecture flags:

[roberpj@orc-login2:~] cat /proc/cpuinfo | grep flags | tail -1
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid amd_dcm pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr npt lbrv svm_lock nrip_save pausefilter
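Rather than reading the long flags line by eye, individual instruction-set flags can be tested in a script. The following is a minimal sketch. The function name has_cpu_flag is hypothetical, and the flag names checked (sse2, sse4_2, avx, avx2) are the names Linux reports in /proc/cpuinfo.

```shell
# Sketch: succeed if the named flag occurs as a whole word in a flags string.
# has_cpu_flag is a hypothetical helper name.
has_cpu_flag() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Report a few flags for the last listed processor, if /proc/cpuinfo is readable.
if [ -r /proc/cpuinfo ]; then
    host_flags=$(grep '^flags' /proc/cpuinfo | tail -1 | cut -d: -f2)
    for f in sse2 sse4_2 avx avx2; do
        if has_cpu_flag "$host_flags" "$f"; then
            echo "$f: yes"
        else
            echo "$f: no"
        fi
    done
fi
```

Matching on whole words matters here: the Opteron flags above contain sse4a, which must not be mistaken for sse4_1 or sse4_2.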

The -x Option

Codes can however be compiled for a specific processor (for instance one residing on a compute node) using the -x option. The resulting binary will contain Streaming SIMD Extensions and/or Advanced Vector Extensions instructions and will therefore only run on processors supporting them. For example, the SHARCNET orca Opteron login nodes do not support SSE4.2 or AVX, whereas the orca Xeon compute node orc389 supports both; neither supports CORE-AVX2. This can be demonstrated as follows:

[roberpj@orc-login1:~] icc -xSSE4.2 hello.c
[roberpj@orc-login1:~] ./a.out
Fatal Error: This program was not built to run on the processor in your system.
The allowed processors are: Intel(R) processors with SSE4.2 and POPCNT instructions support.
[roberpj@orc-login2:~]  ssh orc389
[roberpj@orc389:~] ./a.out
Hello World!

[roberpj@orc-login1:~] icc -xAVX hello.c
[roberpj@orc-login1:~] ./a.out
Fatal Error: This program was not built to run in your system.
Please verify that both the operating system and the processor support Intel(R) AVX.
[roberpj@orc-login1:~] ssh orc389
[roberpj@orc389:~] ./a.out
Hello World!

[roberpj@orc-login1:~] icc -xCORE-AVX2 hello.c
[roberpj@orc-login1:~] ./a.out
Fatal Error: This program was not built to run in your system.
Please verify that both the operating system and the processor support Intel(R) AVX2, BMI, LZCNT and FMA instructions.
[roberpj@orc-login1:~] ssh orc389
[roberpj@orc389:~] ./a.out
Fatal Error: This program was not built to run in your system.
Please verify that both the operating system and the processor support Intel(R) AVX2, BMI, LZCNT and FMA instructions.

All possible argument values for -x with the intel/15.0.1 module can be listed by running:

[roberpj@orc-login2:~] icc -help codegen 

-x<code>  generate specialized code to run exclusively on processors
          indicated by <code> as described below

            SSE2    May generate Intel(R) SSE2 and SSE instructions for Intel
                    processors.  Optimizes for the Intel NetBurst(R)
                    microarchitecture.
            SSE3    May generate Intel(R) SSE3, SSE2, and SSE instructions for
                    Intel processors.  Optimizes for the enhanced Pentium(R) M 
                    processor microarchitecture and Intel NetBurst(R)
                    microarchitecture. 
            SSSE3   May generate Intel(R) SSSE3, SSE3, SSE2, and SSE
                    instructions for Intel processors.  Optimizes for the
                    Intel(R) Core(TM) microarchitecture.
            SSE4.1  May generate Intel(R) SSE4 Vectorizing Compiler and Media
                    Accelerator instructions for Intel processors.  May 
                    generate Intel(R) SSSE3, SSE3, SSE2, and SSE instructions
                    and it may optimize for Intel(R) 45nm Hi-k next generation
                    Intel Core(TM) microarchitecture.
            SSE4.2  May generate Intel(R) SSE4 Efficient Accelerated String
                    and Text Processing instructions supported by Intel(R)
                    Core(TM) i7 processors.  May generate Intel(R) SSE4 
                    Vectorizing Compiler and Media Accelerator, Intel(R) SSSE3,
                    SSE3, SSE2, and SSE instructions and it may optimize for
                    the Intel(R) Core(TM) processor family.
            AVX     May generate Intel(R) Advanced Vector Extensions (Intel(R)
                    AVX), Intel(R) SSE4.2, SSE4.1, SSSE3, SSE3,
                    SSE2, and SSE instructions for Intel(R) processors.
            CORE-AVX2
                    May generate Intel(R) Advanced Vector Extensions 2
                    (Intel(R) AVX2), Intel(R) AVX, SSE4.2, SSE4.1, SSSE3, SSE3,
                    SSE2, and SSE instructions for Intel(R) processors.
            CORE-AVX-I
                    May generate Intel(R) Advanced Vector Extensions (Intel(R)
                    AVX), including instructions in Intel(R) Core 2(TM)
                    processors in process technology smaller than 32nm,
                    Intel(R) SSE4.2, SSE4.1, SSSE3, SSE3, SSE2, and SSE
                    instructions for Intel(R) processors.
            ATOM_SSE4.2
                    May generate MOVBE instructions for Intel(R) processors,
                    depending on the setting of option -minstruction.
                    May also generate Intel(R) SSE4.2, SSE3, SSE2, and SSE
                    instructions for Intel processors. Optimizes for Intel(R)
                    Atom(TM) processors that support Intel(R) SSE4.2 and MOVBE
                    instructions.
            ATOM_SSSE3
                    May generate MOVBE instructions for Intel(R) processors,
                    depending on the setting of option -minstruction.
                    May also generate Intel(R) SSSE3, SSE3, SSE2, and SSE
                    instructions for Intel processors. Optimizes for the
                    Intel(R) Atom(TM) processor that support Intel(R) SSE
                    and MOVBE instructions.
            MIC-AVX512
                    May generate Intel(R) Advanced Vector Extensions 512
                    (Intel(R) AVX-512) Foundation instructions, Intel(R)
                    AVX-512 Conflict Detection instructions, Intel(R) AVX-512
                    Exponential and Reciprocal instructions, Intel(R) AVX-512
                    Prefetch instructions for Intel(R) processors, and the
                    instructions enabled with CORE-AVX2. Optimizes for Intel(R)
                    processors that support Intel(R) AVX-512 instructions.
            CORE-AVX512 
                    May generate Intel(R) Advanced Vector Extensions 512 
                    (Intel(R) AVX-512) Foundation instructions, Intel(R) 
                    AVX-512 Conflict Detection instructions, Intel(R) AVX-512 
                    Doubleword and Quadword instructions, Intel(R) AVX-512 
                    Byte and Word instructions and Intel(R) AVX-512 Vector 
                    Length Extensions for Intel(R) processors, and the 
                    instructions enabled with CORE-AVX2. Optimizes for Intel(R)
                    processors that support Intel(R) AVX-512 instructions. 
-xHost    generate instructions for the highest instruction set and processor
          available on the compilation host machine
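When the build host differs from the target node, -xHost picks the wrong instruction set. One workaround is to copy the flags line from the target node and map it to a -x code by hand or in a script. The sketch below is a simplification under stated assumptions: the function name pick_x_code is hypothetical, it covers only a subset of the codes listed above, and it uses a single Linux flag per code (for example avx2 for CORE-AVX2, although the runtime check also mentions BMI, LZCNT and FMA). It also does not check the processor vendor, which the -x runtime check does.

```shell
# Sketch: map a /proc/cpuinfo flags string to the most capable matching -x code.
# pick_x_code is a hypothetical helper; "pni" is the Linux name for SSE3.
pick_x_code() {
    flags=" $1 "
    # Ordered from most to least capable; the first match wins.
    for pair in avx2:CORE-AVX2 avx:AVX sse4_2:SSE4.2 sse4_1:SSE4.1 \
                ssse3:SSSE3 pni:SSE3 sse2:SSE2; do
        flag=${pair%%:*}
        code=${pair#*:}
        case "$flags" in
            *" $flag "*) echo "-x$code"; return 0 ;;
        esac
    done
    return 1
}

# Example, with flags copied from a target compute node:
#   icc $(pick_x_code "$target_flags") hello.c
```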

The -ax Option

Additionally, codes can be compiled with the -ax option. The resulting binary will contain both a default (baseline) code path and one or more microarchitecture-optimized (feature-specific) code paths. This selective runtime capability is possible due to the Processor Dispatch Technology used by the Intel compilers. Currently we expect (pending the outcome of verification testing) that the additional optimized code paths will be usable on non-Intel nodes; in the event they are not, consider using the GNU or PGI compiler instead. One catch when using -ax is that the binary grows larger with each processor target added. Some examples follow:

Example 1)
To specify a default code path (other than the default SSE2 minimum requirement) for any Intel-compatible non-Intel processor that supports SSE3, plus optimized code paths for Intel processors that support AVX or SSE4.2, one would do:

icpc -xSSE3 -axAVX,SSE4.2 sample.cpp

Example 2)
To create a binary with an optimized code path for SSE4.2 do the following. Note that this hello binary runs on orca's AMD Opteron login node since a default code path for SSE2 is also embedded by definition:

[roberpj@orc-login2:~] icc -axSSE4.2 hello.c
[roberpj@orc-login2:~] ./a.out
Hello World!

Example 3)
To build a binary additionally optimized for Haswell containing AVX2 and fma instructions do the following:

icpc -axCORE-AVX2 -fma sample.cpp

To build the same binary with a single optimized code path for exclusive use on Haswell do the following:

icpc -xCORE-AVX2 -fma sample.cpp

Example 4)
To specify a more restrictive default code path for processors that support SSE3 and an optimized code path for AVX2 use:

icpc -xSSE3 -axCORE-AVX2 sample.cpp
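The -x/-ax combinations in the examples above follow one pattern: a single baseline code and a comma-separated list of additional targets. A build script could assemble the flags as sketched below. The function name x_ax_flags is hypothetical, and the function does not validate the codes against the compiler's list.

```shell
# Sketch: build "-x<baseline> -ax<code1>,<code2>,..." from its arguments.
# x_ax_flags is a hypothetical helper name.
x_ax_flags() {
    baseline=$1
    shift
    out="-x$baseline"
    if [ "$#" -gt 0 ]; then
        # Join the remaining arguments with commas for -ax.
        list=$1
        shift
        for code in "$@"; do
            list="$list,$code"
        done
        out="$out -ax$list"
    fi
    echo "$out"
}

# Example, equivalent to Example 1 above:
#   icpc $(x_ax_flags SSE3 AVX SSE4.2) sample.cpp
```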

All possible argument values for -ax with the intel/15.0.1 module can be listed by running:

[roberpj@orc-login2:~] icc -help codegen 

-ax<code1>[,<code2>,...]
          generate code specialized for processors specified by <codes>
          while also generating generic IA-32 instructions.  
          <codes> includes one or more of the following:

            SSE2    May generate Intel(R) SSE2 and SSE instructions for Intel
                    processors.
            SSE3    May generate Intel(R) SSE3, SSE2, and SSE instructions for
                    Intel processors. 

            SSSE3   May generate Intel(R) SSSE3, SSE3, SSE2, and SSE
                    instructions for Intel processors.

            SSE4.1  May generate Intel(R) SSE4.1, SSSE3, SSE3, SSE2, and SSE
                   instructions for Intel processors.

            SSE4.2  May generate Intel(R) SSE4.2, SSE4.1, SSSE3, SSE3, SSE2,
                    and SSE instructions for Intel processors.
            AVX     May generate Intel(R) Advanced Vector Extensions (Intel(R)
                    AVX), Intel(R) SSE4.2, SSE4.1, SSSE3, SSE3,
                    SSE2, and SSE instructions for Intel(R) processors.
            CORE-AVX2
                    May generate Intel(R) Advanced Vector Extensions 2
                    (Intel(R) AVX2), Intel(R) AVX, SSE4.2, SSE4.1, SSSE3, SSE3,
                    SSE2, and SSE instructions for Intel(R) processors.
            CORE-AVX-I
                    May generate Intel(R) Advanced Vector Extensions (Intel(R)
                    AVX), including instructions in Intel(R) Core 2(TM)
                    processors in process technology smaller than 32nm,
                    Intel(R) SSE4.2, SSE4.1, SSSE3, SSE3, SSE2, and SSE
                    instructions for Intel(R) processors.
            MIC-AVX512
                    May generate Intel(R) Advanced Vector Extensions 512
                    (Intel(R) AVX-512) Foundation instructions, Intel(R)
                    AVX-512 Conflict Detection instructions, Intel(R) AVX-512
                    Exponential and Reciprocal instructions, Intel(R) AVX-512
                    Prefetch instructions for Intel(R) processors, and the
                    instructions enabled with CORE-AVX2.
            CORE-AVX512 
                    May generate Intel(R) Advanced Vector Extensions 512 
                    (Intel(R) AVX-512) Foundation instructions, Intel(R) 
                    AVX-512 Conflict Detection instructions, Intel(R) AVX-512
                    Doubleword and Quadword instructions, Intel(R) AVX-512 
                    Byte and Word instructions and Intel(R) AVX-512 Vector 
                    Length Extensions for Intel(R) processors, and the 
                    instructions enabled with CORE-AVX2. 

Further reading on the topics covered in the above Optimization section is provided in the Code Generation subsection of the References section found at the bottom of this document.

Profiling

Intel provides instrumentation options for profiling codes (comparable to gprof/gprof2dot). They are documented in the Profile-Guided Optimization (PGO) articles located in the Key Features section of the Intel C++ Compiler guide at https://software.intel.com/en-us/compiler_15.0_ug_c and the Intel Fortran Compiler guide at https://software.intel.com/en-us/compiler_15.0_ug_f. The Intel Data Viewer Utility tool is located at /opt/sharcnet/intel/15.0.1/composer_xe_2015.1.133/bin/intel64/loopprofileviewer.sh

Invoking C++11, C++14, etc.

The SHARCNET Intel modules support C++11 when compiling with -std=c++11, shown here with intel/16.0.3:

module unload intel mkl
module load intel/16.0.3
icpc -std=c++11 myprog.cpp

where a description of C++11 Features Supported by Intel® C++ Compiler is provided in https://software.intel.com/en-us/articles/c0x-features-supported-by-intel-c-compiler.

A table of compiler support for new C++ features (including C++11, C++14, C++17 and various technical specifications) is given in http://en.cppreference.com/w/cpp/compiler_support.

Intel Debuggers

For compiler versions Intel 15 and later the idb debugger command has been replaced by gdb-ia as discussed in https://software.intel.com/en-us/articles/debugging-intel-xeon-phi-applications-on-linux-host.

gdb (legacy systems)

module unload intel mkl openmpi
module load intel/15.0.6
source /opt/sharcnet/intel/15.0.6/composer_xe_2015.6.233/bin/debuggervars.sh
OR
module load intel/16.0.4
source /opt/sharcnet/intel/16.0.4/debugger_2016/bin/debuggervars.sh
OR
module load intel/17.0.4
source /opt/sharcnet/intel/17.0.4/debugger_2017/bin/debuggervars.sh
OR
module load intel/18.0.1
gdb-ia

gdb (graham system)

module unload intel imkl openmpi
export MODULEPATH=/opt/software/modules:$MODULEPATH
module load intelcluster/17.0.5
OR
export MODULEPATH=/opt/software/modules:$MODULEPATH
module load intelcluster/18.0.1
gdb-ia

idb

For compiler versions prior to Intel 15, the GUI-based idb command may not launch properly on some systems due to Java version resource limit incompatibilities. In such situations either use the command line variant idbc, or switch to a SHARCNET cluster development node or the SHARCNET vdi-centos6 machine. For example, on an orca development node one would do:

 ssh -Y orca.sharcnet.ca
 ssh -Y orc-devel2
 idb

Invoking Compilers

icc and icpc

As described in Invoking the Intel® C++ Compiler, found at https://software.intel.com/en-us/node/522355, for Linux OS:

You can invoke the Intel® C++ Compiler for Linux OS on the command line by using either icc to compile C source files or icpc to compile C++ source files.

When you invoke the compiler with icc, the compiler builds C source files using C libraries and C include files. If you use icc with a C++ source file, it is compiled as a C++ file. Use icc to link C object files.

When you invoke the compiler with icpc the compiler builds C++ source files using C++ libraries and C++ include files. If you use icpc with a C source file, it is compiled as a C++ file. Use icpc to link C++ object files.

The icc or icpc command does the following:

Compiles and links the input source file(s).
Produces one executable file, a.out, in the current directory.

The extensions for C/C++ are defined in Understanding File Extensions at https://software.intel.com/en-us/node/522359.

For C source files use: file.c
For C++ source files use: file.C, file.CC, file.cc, file.cpp or file.cxx
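The extension rules above can be encoded in a script that picks the compiler driver for a given source file. The following is a minimal sketch. The function name compiler_for is hypothetical, and only the extensions listed above are handled.

```shell
# Sketch: choose icc or icpc from the source file extension listed above.
# compiler_for is a hypothetical helper name; matching is case sensitive,
# so file.c selects icc while file.C selects icpc.
compiler_for() {
    case "$1" in
        *.c)                        echo icc ;;
        *.C|*.CC|*.cc|*.cpp|*.cxx)  echo icpc ;;
        *)  echo "unknown extension: $1" >&2
            return 1 ;;
    esac
}

# Example:
#   $(compiler_for myprog.cpp) -O2 myprog.cpp
```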

ifort

As described in Invoking the Intel® Fortran Compiler at https://software.intel.com/en-us/node/522357 for Linux OS:

You can invoke the Intel® Fortran Compiler on the command line using the ifort command. The syntax of the ifort command is:

ifort [options] input_file(s)

The ifort command can compile and link projects in one step or compile them then link them as a separate step.

In most cases, a single ifort command will invoke both the compiler and linker. You can also use ld (Linux* OS and OS X*) to build libraries of object modules. These commands provide syntax instructions at the command line if you request it with the help option.

The ifort command automatically references the appropriate Intel® Fortran Run-Time Libraries when it invokes the linker. To link one or more object files created by the Intel® Fortran compiler, you should use the ifort command instead of the link command.

The ifort command invokes a driver program that is the user interface to the compiler and linker. It accepts a list of command options and file names and directs processing for each file. The driver program does the following:

Calls the Intel® Fortran Compiler to process Fortran files.
Passes the linker options to the linker.
Passes object files created by the compiler to the linker.
Passes libraries to the linker.
Calls the linker or librarian to create the executable or library file.

Release Note Changes

o An error message about -lguide not being found occurs because libguide.a and libguide.so from Intel 11 are deprecated in newer releases of the Intel compiler suite, replaced by libiomp and libiomp5 in release 12. To remedy this, -lguide should be replaced with -liomp5.
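In makefiles or build scripts with many link lines, the -lguide to -liomp5 replacement can be done with sed. The following is a sketch only. The function name fix_lguide is hypothetical, it rewrites just the standalone -lguide token, and its output should be reviewed before overwriting any build file.

```shell
# Sketch: rewrite standalone -lguide tokens to -liomp5 on standard input.
# fix_lguide is a hypothetical helper name.
fix_lguide() {
    # First expression handles -lguide at end of line, second handles
    # -lguide followed by whitespace.
    sed -e 's/-lguide$/-liomp5/' \
        -e 's/-lguide\([[:space:]]\)/-liomp5\1/g'
}

# Example:
#   fix_lguide < Makefile > Makefile.new
```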

Profmerge Workaround

SHARCNET no longer sets LD_LIBRARY_PATH in modules; instead LD_RUN_PATH is set so that an rpath is built into programs at compile time. Since the profmerge binary is part of the compiler rather than built by the user, its library path is therefore not set. Users must set it manually (for Intel versions > 12.1.3) as follows:

[roberpj@orc-login1:~] module load intel/15.0.3
[roberpj@orc-login1:~] profmerge
profmerge: error while loading shared libraries: libcilkrts.so.5: cannot open shared object file: No such file or directory
[roberpj@orc-login1:~] export LD_LIBRARY_PATH=/opt/sharcnet/intel/15.0.3/composer_xe_2015.3.187/compiler/lib/intel64/:$LD_LIBRARY_PATH
[roberpj@orc-login1:~] profmerge
remark #30056: no .dyn files to merge.
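If the export above is placed in a login script, it is appended again on every invocation, and an initially empty LD_LIBRARY_PATH leaves a trailing colon. A slightly more defensive variant might look like the sketch below. The function name prepend_ld_library_path is hypothetical, and the directory in the usage comment is the one from the example above.

```shell
# Sketch: prepend a directory to LD_LIBRARY_PATH once, without leaving an
# empty entry when the variable starts out unset or empty.
# prepend_ld_library_path is a hypothetical helper name.
prepend_ld_library_path() {
    dir=$1
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*) ;;   # already present, nothing to do
        *) LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Example:
#   prepend_ld_library_path /opt/sharcnet/intel/15.0.3/composer_xe_2015.3.187/compiler/lib/intel64
#   profmerge
```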

References

o Intel Parallel Studio XE (Sharcnet has Composer Edition)
https://software.intel.com/en-us/intel-parallel-studio-xe/try-buy

o Intel® Parallel Studio XE Release Notes
https://software.intel.com/en-us/articles/intel-parallel-studio-xe-release-notes

o Developer Reference Intel® MKL 2018 Update 2 (sparse solver routines --> pardiso)
https://software.intel.com/en-us/mkl-developer-reference-fortran-what-s-new

Intel 17

o Intel® C++ Compiler 17.0 Developer Guide and Reference
https://software.intel.com/en-us/intel-cplusplus-compiler-17.0-user-and-reference-guide

o Intel® Fortran Compiler 17.0 Developer Guide and Reference
https://software.intel.com/en-us/intel-fortran-compiler-17.0-user-and-reference-guide

o Intel® Parallel Studio XE 2017 Initial Release Readme
https://software.intel.com/en-us/articles/intel-parallel-studio-xe-2017-initial-release-readme

Intel 16

o Intel® C++ Compiler 16.0 User and Reference Guide
https://software.intel.com/en-us/intel-cplusplus-compiler-16.0-user-and-reference-guide

o Intel® Fortran Compiler 16.0 User and Reference Guide
https://software.intel.com/en-us/intel-fortran-compiler-16.0-user-and-reference-guide

o Intel® Parallel Studio XE 2016 Update 4 Readme
https://software.intel.com/en-us/articles/intel-parallel-studio-xe-2016-update-4-readme

o Intel Fortran Compiler 16.0 Update 4 for Linux* Release Notes for Intel Parallel Studio XE 2016
https://software.intel.com/en-us/articles/intel-fortran-compiler-160-for-linux-release-notes-for-intel-parallel-studio-xe-2016

Intel 15

o Intel® Parallel Studio XE 2015 Composer Edition C++ Release Notes
https://software.intel.com/en-us/articles/intel-parallel-studio-xe-2015-composer-edition-c-release-notes

o Intel® Fortran Compiler 15.0 Release Notes
https://software.intel.com/en-us/articles/intel-parallel-studio-xe-2015-composer-edition-fortran-release-notes

o Intel® Parallel Studio XE 2015 Update 3 Composer Edition for C++ Linux
https://software.intel.com/en-us/articles/intel-parallel-studio-xe-2015-update-3-composer-edition-for-c-linux

o Intel® Parallel Studio XE 2015 Update 3 Composer Edition for Fortran Linux
https://software.intel.com/en-us/articles/intel-parallel-studio-xe-2015-update-3-composer-edition-for-fortran-linux

o Intel® Parallel Studio XE 2015 Composer Edition Compilers Fixes List
https://software.intel.com/en-us/articles/intel-composer-xe-2015-compilers-fixes-list

o Intel User and Reference Guide for the Intel® C++ Compiler 15.0
https://software.intel.com/en-us/compiler_15.0_ug_c

o User and Reference Guide for the Intel® Fortran Compiler 15.0
https://software.intel.com/en-us/compiler_15.0_ug_f

Compiler Support Topics

o Intel Developer Zone Support
https://software.intel.com/en-us/support

o Intel Developer Forums for Asking Questions

https://software.intel.com/en-us/forum
https://software.intel.com/en-us/forums/opencl
https://software.intel.com/en-us/forums/intel-c-compiler
https://software.intel.com/en-us/forums/intel-inspector-xe
https://software.intel.com/en-us/forums/intel-math-kernel-library
https://software.intel.com/en-us/forums/intel-many-integrated-core
https://software.intel.com/en-us/forums/intel-threading-building-blocks
https://software.intel.com/en-us/forums/intel-integrated-performance-primitives
https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x

o C++11 Features Supported by Intel® C++ Compiler
https://software.intel.com/en-us/articles/c0x-features-supported-by-intel-c-compiler

o Getting Started with the Intel C++ Compiler
https://www.sharcnet.ca/Software/Intel/ICC/

o Getting Started with the Intel Fortran Compiler
https://www.sharcnet.ca/Software/Intel/IFC/

o Intel Guide for Developing Multithreaded Applications - Parts 1 & 2
https://www.sharcnet.ca/Software/Intel/docs/100407_Parallel_Programming_01.pdf
https://www.sharcnet.ca/Software/Intel/docs/100412_Parallel_Programming_02.pdf

o Intel Parallel Debugger Extensions for Intel Composer XE (version 12 only)
http://software.intel.com/en-us/articles/parallel-debugger-extension/

o Intel Compiler Version Number Mapping
http://software.intel.com/en-us/articles/intel-compiler-and-composer-update-version-numbers-to-compiler-version-number-mapping

Code Generation

o Step by Step Performance Optimization with Intel® C++ Compiler
https://software.intel.com/en-us/articles/step-by-step-optimizing-with-intel-c-compiler

o Targeting IA-32 and Intel(R) 64 Architecture Processors Automatically
https://www.sharcnet.ca/Software/Intel/IntelIFC/compiler_f/main_for/mergedProjects/optaps_for/common/optaps_dsp_targ.htm

o Targeting Multiple IA-32 and Intel(R) 64 Architecture Processors for Run-time Performance
https://www.sharcnet.ca/Software/Intel/IntelICC/compiler_c/main_cls/mergedProjects/optaps_cls/common/optaps_dsp_qax.htm

o IA-32 and Intel®64 Processor Targeting Overview
https://software.intel.com/en-us/articles/ia-32-and-intel-64-processor-targeting-overview

o Intel® Compiler Options for Intel® SSE and Intel® AVX generation and processor-specific optimizations - Part 1 of 2
https://software.intel.com/en-us/articles/performance-tools-for-software-developers-intel-compiler-options-for-sse-generation-and-processor-specific-optimizations

o Performance Tools for Software Developers - SSE generation and processor-specific optimizations continued - Part 2 of 2
https://software.intel.com/en-us/articles/performance-tools-for-software-developers-sse-generation-and-processor-specific-optimizations-continue

o Intel AVX2 Support in the Intel® C++ Compiler
https://software.intel.com/en-us/articles/intel-system-studio-avx2-support

Haswell Related

o Intel’s Haswell CPU Microarchitecture
http://www.realworldtech.com/haswell-cpu/

o Intel’s Haswell is an unprecedented threat to Nvidia, AMD
http://www.extremetech.com/computing/136219-intels-haswell-is-an-unprecedented-threat-to-nvidia-amd

o Haswell New Instruction Descriptions Now Available!
https://software.intel.com/en-us/blogs/2011/06/13/haswell-new-instruction-descriptions-now-available

o Benchmarks: Haswell vs. Ivy Bridge for Financial Analytics
http://blog.xcelerit.com/benchmarks-haswell-vs-ivy-bridge-for-financial-analytics/

Vector Programming

o Fortran Array Data and Arguments and Vectorization
https://software.intel.com/en-us/articles/fortran-array-data-and-arguments-and-vectorization

o Getting the most out of the Intel compiler with new optimization reports
https://software.intel.com/en-us/videos/getting-the-most-out-of-the-intel-compiler-with-new-optimization-reports

o Explicit Vector Programming
https://software.intel.com/en-us/tags/43556

o Requirements for Vectorizable Loops
https://software.intel.com/en-us/articles/requirements-for-vectorizable-loops

o Explicit Vector Programming in Fortran
https://software.intel.com/en-us/articles/explicit-vector-programming-in-fortran

Webinars

Development Tools Webinars
https://software.intel.com/en-us/events/development-tools-webinars

Versions

Intel Parallel Studio XE Version Numbering
https://software.intel.com/en-us/articles/intel-compiler-and-composer-update-version-numbers-to-compiler-version-number-mapping

Which version of the Intel® IPP, Intel® MKL and Intel® TBB Libraries are Included in the Intel® Composer Bundles?
https://software.intel.com/en-us/articles/which-version-of-the-intel-ipp-intel-mkl-and-intel-tbb-libraries-are-included-in-the-intel