Description: The AMD Core Math Library, including LAPACK, BLAS and FFT routines
SHARCNET Package information: see ACML software page in web portal
Full list of SHARCNET supported software


ACML is a set of threaded math routines optimized for high performance on AMD Opteron™ processors. It consists of the BLAS, LAPACK, FFTs and Random Number Generators and supports GFORTRAN, Intel Fortran, NAG Fortran, Open64 and PGI Fortran.

The ACML library is closed source, distributed as a set of binaries compatible with various compilers. The user needs to make sure that the library version selected and the compiler used are compatible.

Version Selection

The appropriate compiler and corresponding module should be loaded prior to linking your code.

MODULE VARIANTS (int64 & mp)

o acml/gfortran/5.3.0 - gfortran; functions expect INTEGER (32-bit) arguments.
o acml/gfortran/5.3.0_mp - gfortran; INTEGER (32-bit) arguments; use when the program is compiled for OpenMP.
o acml/gfortran-int64/5.3.0 - gfortran; functions expect INTEGER*8 (64-bit) arguments.
o acml/gfortran-int64/5.3.0_mp - gfortran; INTEGER*8 (64-bit) arguments with OpenMP.
o acml/ifort/5.3.0 - Intel; INTEGER (32-bit) arguments.
o acml/ifort/5.3.0_mp - Intel; INTEGER (32-bit) arguments with OpenMP.
o acml/ifort-int64/5.3.0 - Intel; INTEGER*8 (64-bit) arguments.

For a complete list of installed acml modules, issue the following command:

module avail acml

Job Submission

Programs that use ACML must name the required libraries with link flags on the compile command. The next section illustrates this procedure.

Examples of Job Compilation

For these examples, the following modules were loaded: acml/ifort/5.3.0 intel/12.1.3

Fortran dgemm example

Use the following command to compile the file test_dgemm.f90:

$FC test_dgemm.f90 -llapack


! file name = test_dgemm.f90

      program mainp1
      implicit none
      integer, parameter :: HEIGHT=4
      integer, parameter :: WIDTH=3
      integer, parameter :: K=1
      integer            :: i, j
      double precision   :: ColumnVector(HEIGHT,K)
      double precision   :: RowVector(K,WIDTH)
      double precision   :: Result(HEIGHT,WIDTH)
      double precision   :: ALPHA, BETA
      character*1        :: NoTrans

      ALPHA = 1.0d0
      BETA  = 0.0d0

 !    Fill the column vector with 1..HEIGHT and the row vector with 1..WIDTH.
      do i=1,HEIGHT
        ColumnVector(i,K) = i
      end do
      do j=1,WIDTH
        RowVector(K,j) = j
      end do
      call PrintMatrix(ColumnVector, HEIGHT, K)
      call PrintMatrix(RowVector, K, WIDTH)
 !    To do the calculation, we will use the BLAS function dgemm.
 !    This function calculates:  C = ALPHA*A*B + BETA*C
 !    Here A is HEIGHT x K, B is K x WIDTH and C is HEIGHT x WIDTH.
      NoTrans  =  'N'
      call dgemm(NoTrans,NoTrans,HEIGHT,WIDTH,K,ALPHA,        &
     &     ColumnVector,HEIGHT,RowVector,K,BETA,Result,HEIGHT)
      call PrintMatrix(Result, HEIGHT, WIDTH)
      end program mainp1

 !    Print every element of a matrix together with its indices.
      subroutine PrintMatrix(pMatrix,nRows,nCols)
      implicit none
      integer            :: i, j, nRows, nCols
      double precision   :: pMatrix(nRows,nCols)
      do i=1,nRows
        do j=1,nCols
          print *,i,j,pMatrix(i,j)
        end do
      end do
      print *," "
      end subroutine PrintMatrix
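As a cross-check on what the dgemm call above computes, the operation C = ALPHA*A*B + BETA*C can be sketched in plain C without linking any library. This is only an unoptimized reference under stated assumptions: neither matrix is transposed, storage is column-major as in Fortran, and the name ref_dgemm_nn is a hypothetical helper, not an ACML routine.

```c
#include <stddef.h>

/* Reference (unoptimized) version of C = alpha*A*B + beta*C for the
 * non-transposed case. Matrices are stored column-major, so element
 * (i,j) of a matrix with leading dimension ld lives at index i + j*ld.
 * ref_dgemm_nn is a hypothetical name used for illustration only. */
void ref_dgemm_nn(int m, int n, int k, double alpha,
                  const double *a, int lda,
                  const double *b, int ldb,
                  double beta, double *c, int ldc)
{
    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            double sum = 0.0;
            for (int p = 0; p < k; p++)
                sum += a[i + (size_t)p * lda] * b[p + (size_t)j * ldb];
            /* Following the BLAS convention, C is not read when beta is zero. */
            double prev = (beta == 0.0) ? 0.0 : beta * c[i + (size_t)j * ldc];
            c[i + (size_t)j * ldc] = alpha * sum + prev;
        }
    }
}
```

Called with m=4, n=3, k=1, the column vector (1,2,3,4) and the row vector (1,2,3), this fills C with the outer product, so element (i,j) equals i*j in the 1-based indexing of the Fortran example.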

Fortran zdotc example

Use the following command to compile:

$FC fortran_zdotc.f90 $CPPFLAGS $LDFLAGS  -lacml -lifcoremt_pic


! file name = fortran_zdotc.f90

      program fortran_zdotc
      implicit none

      integer, parameter :: NN=5
      integer :: n, inca, incb
      integer :: i

      REAL*8 :: Di, Dn

!     The vectors are indexed from 0 to match the loop below.
      COMPLEX*16 :: ZX(0:NN-1), ZY(0:NN-1)
      COMPLEX*16 :: ZDPXY, ZDPYX
!     ZDOTC is the BLAS function supplied by the library.
      COMPLEX*16, EXTERNAL :: ZDOTC

      inca = 1
      incb = 1

      n  = NN
      Dn = DBLE(n)

      print *,""

      DO i=0,n-1
        Di = i
!       KIND is given so that CMPLX returns a double-precision value.
        ZX(i) = CMPLX(Di,2.0D0*Di,KIND=KIND(1.0D0))
        ZY(i) = CMPLX(Dn-Di,2.0D0*Di,KIND=KIND(1.0D0))
        write(6,1001) ZX(i),ZY(i)
 1001   format("(",f6.2,",",f6.2,")    (",f6.2,",",f6.2,")")
      END DO

      ZDPXY = ZDOTC(n,ZX,inca,ZY,incb)
      ZDPYX = ZDOTC(n,ZY,incb,ZX,inca)

 1002   format("(",f6.2,",",f6.2,")")
      print *,""
      print *,"<ZX,ZY>"
      write(6,1002) ZDPXY

      print *,""
      print *,"<ZY,ZX>"
      write(6,1002) ZDPYX
      print *,""

      print *,"Job completed successfully"
      print *,""

      end program fortran_zdotc

C zdotc example

Use the following command to compile:

$CC main_zdotc.c $CPPFLAGS $LDFLAGS  -lacml


/* file name = main_zdotc.c */

/* The following example illustrates a call from a C program to the 
 * complex BLAS Level 1 function zdotc(). This function computes 
 * the dot product of two double-precision complex vectors.        

   DOT_PRODUCT = <ZX,ZY> = SUM[i=0,i=n-1] { DCONJG(ZX(I)) * ZY(I) }
                                            ------------- * -----
   Note that <ZX,ZY> = DCONJG(<ZY,ZX>)

 * In this example, the complex dot product is returned in the structure c. 
 */

#include <stdio.h>
#include "acml.h"
#define N 5 

/* void zdotc(); */
extern doublecomplex zdotc(int n, doublecomplex *x, int incx, doublecomplex *y, int incy);

int main() {
  int n, inca = 1, incb = 1, i;

  int DEBUG=1;

/*  typedef struct {...}  MKL_Complex16;   defined in "mkl.h"    */

  doublecomplex a[N], b[N], c, d;
  n = N;


  /* Fill both vectors and print them side by side. */
  for ( i = 0; i < n; i++ ){
    a[i].real = (double)i;
    a[i].imag = (double)i * 2.0;

    b[i].real = (double)(n - i);
    b[i].imag = (double)i * 3.0;

    printf(" ( %6.2f, %6.2f) ( %6.2f, %6.2f) \n",a[i].real,a[i].imag,b[i].real,b[i].imag);
  }

/*  For MKL had: 
    zdotc( &c, &n, a, &inca, b, &incb ); 
    zdotc( &d, &n, b, &incb, a, &inca ); */

  c = zdotc( n, a, inca, b, incb );
  d = zdotc( n, b, incb, a, inca );

  printf("The complex dot product <a|b> is: ( %6.2f, %6.2f) \n", c.real, c.imag );
  printf("The complex dot product <b|a> is: ( %6.2f, %6.2f) \n", d.real, d.imag );
  printf("Job completed successfully\n");

  return 0;
}
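To check the values that the zdotc example should print without linking ACML, the conjugated dot product can be written out by hand. The following is a minimal sketch, not the library's implementation: the type dcomplex is a stand-in for ACML's doublecomplex (assumed here only to have the .real and .imag members used above), ref_zdotc is a hypothetical helper name, and the increments are assumed to be positive.

```c
/* Stand-in for ACML's doublecomplex, which the example above accesses
 * through its .real and .imag members (assumption for illustration). */
typedef struct { double real, imag; } dcomplex;

/* Reference conjugated dot product: sum over i of conj(x[i]) * y[i].
 * ref_zdotc is a hypothetical name; positive increments are assumed. */
dcomplex ref_zdotc(int n, const dcomplex *x, int incx,
                   const dcomplex *y, int incy)
{
    dcomplex sum = {0.0, 0.0};
    for (int i = 0; i < n; i++) {
        const dcomplex *xi = &x[i * incx];
        const dcomplex *yi = &y[i * incy];
        /* conj(x)*y = (xr - i*xi)(yr + i*yi)
         *           = (xr*yr + xi*yi) + i*(xr*yi - xi*yr) */
        sum.real += xi->real * yi->real + xi->imag * yi->imag;
        sum.imag += xi->real * yi->imag - xi->imag * yi->real;
    }
    return sum;
}
```

With the vectors from the C example (a[i] = (i, 2i) and b[i] = (n-i, 3i) for n = 5), this gives <a|b> = (200, 50) and <b|a> = (200, -50), which illustrates the relation <ZX,ZY> = DCONJG(<ZY,ZX>) noted in the comment.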


General Notes

Example Linking To ACML With Intel Compiler (int64 & mp)

module unload mkl
module load acml/ifort-int64/5.3.0_mp
Then modify your makefile to link against ACML instead, using:
$CPPFLAGS $LDFLAGS  -lacml_mp -lifcoremt_pic


When compiling with GNU compilers such as gfortran, take special care to link against a compatible version of ACML. To find out which Fortran compiler version was used to build each ACML library, use the following command:

grep 'Fortran compiler' /opt/sharcnet/acml/*/*/*/examples/acmlinfo.expected

If you only need this information for the GNU compilers, run for example:

[roberpj@orc-login2:~] grep 'Fortran compiler' /opt/sharcnet/acml/*/gfortran-64bit/gfortran64/examples/acmlinfo.expected
/opt/sharcnet/acml/4.2.0/gfortran-64bit/gfortran64/examples/acmlinfo.expected:Built using Fortran compiler: GNU Fortran (GCC) 3.3 20030312 (prerelease) (SuSE Linux)
/opt/sharcnet/acml/4.3.0/gfortran-64bit/gfortran64/examples/acmlinfo.expected:Built using Fortran compiler: GNU Fortran (GCC) 3.3 20030312 (prerelease) (SuSE Linux)
/opt/sharcnet/acml/4.4.0/gfortran-64bit/gfortran64/examples/acmlinfo.expected:Built using Fortran compiler: GNU Fortran (GCC) 4.3.2
/opt/sharcnet/acml/5.1.0/gfortran-64bit/gfortran64/examples/acmlinfo.expected:Built using Fortran compiler: GNU Fortran (GCC) 4.6.0
/opt/sharcnet/acml/5.2.0/gfortran-64bit/gfortran64/examples/acmlinfo.expected:Built using Fortran compiler: GNU Fortran (GCC) 4.7.1
/opt/sharcnet/acml/5.3.0/gfortran-64bit/gfortran64/examples/acmlinfo.expected:Built using Fortran compiler: GNU Fortran (GCC) 4.7.1


ACML is optimized for AMD Opteron systems; however, it is also installed on Intel-based systems for codes that require it. Codes that statically allocate more than 2GB of data (.bss objects such as Fortran COMMON blocks and C variables with file scope) will likely require adding -mcmodel=medium to the compile commands in the examples above. Otherwise a fatal link error such as "relocation truncated to fit R_X86_64_PC32" might occur.

Also see:


o AMD Developer Homepage

o When Should The ACML int64 Versions Be Used ?

o How to use ACML with different versions of GCC/GFORTRAN