STAR-CCM+
Description: Engineering process for solving problems involving flow (of fluids or solids), heat transfer and stress.
SHARCNET Package information: see STAR-CCM+ software page in web portal


Introduction

To use STAR-CCM+ on sharcnet, a research group must provide its own license file for the sharcnet license server; to make this arrangement please submit a problem ticket. The current installation of STAR-CCM+ provides loadable modules for both single and double precision versions.

Version Selection

Graham Cluster

To see available modules on Graham run:

[roberpj@gra-login2:~] module avail starccm
------------------------ Core Modules ---------------------
   starccm-mixed/11.06.011 (phys)    starccm-mixed/12.06.011 (phys)      starccm/11.06.011-R8 (phys)    starccm/12.06.011-R8 (phys)
   starccm-mixed/12.04.011 (phys)    starccm-mixed/13.04.010 (phys,D)    starccm/12.04.011-R8 (phys)    starccm/13.04.010-R8 (phys,D)

To load one of the modules, do as usual:

module load starccm/12.04.011-R8

Before running STAR-CCM+, however, you must configure a starccm.lic file.
On Graham, starccm.lic can be set up to route traffic out through a sharcnet proxy to reach the remote PoD server:

[roberpj@gra-login3:~/.licenses] cat starccm.lic
SERVER license3.sharcnet.ca ANY 1999
USE_SERVER

All modern national systems, including Graham, also support connecting directly to the remote PoD server:

[roberpj@gra-login3:~/.licenses] cat starccm.lic
SERVER flex.cd-adapco.com ANY 1999
USE_SERVER
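
As a minimal sketch, the direct-connection variant of this file could be created from a login node shell as follows (substitute the sharcnet proxy server line if preferred):

  mkdir -p ~/.licenses
  printf 'SERVER flex.cd-adapco.com ANY 1999\nUSE_SERVER\n' > ~/.licenses/starccm.lic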

Legacy Clusters

To load the mixed (single) or double precision versions:

  module unload intel openmpi
  module load gcc/4.9.2 openmpi/gcc492-std/1.8.7
  module load starccmplus/mixed/version#   OR  module load starccmplus/double/version#

where:

version# = 10.02.012, 10.06.009 or 11.06.011 
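
For example, to load the double precision 10.06.009 version (as also used in the example job below):

  module unload intel openmpi
  module load gcc/4.9.2 openmpi/gcc492-std/1.8.7
  module load starccmplus/double/10.06.009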

Notes:
o Beginning Jan 28, 2015, both the gcc and openmpi/gcc modules must be loaded before the starccmplus module to submit a job to the mpi queue.
o Starting with version 10.02.012, specifying starccmplus to sqsub is sufficient, instead of starccmplus-openmpi as was required by 9.04.009.
o Also starting with version 10.02.012, the single precision sharcnet module was renamed mixed, since some fields (such as coordinates, pressure and displacement) are handled as doubles by STAR-CCM+; see the User Guide under Using STAR-CCM+ > Working with Mixed or Double Precision for details.

Job Submission

Graham & Orca Cluster

STAR-CCM+ jobs are submitted on Graham and the new orca by creating a Slurm script as described on https://docs.computecanada.ca/wiki/Star-CCM+ and then submitting the job with:

sbatch myslurmscript.sh
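
Once submitted, the job can be monitored with the standard Slurm command:

  squeue -u $USER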

License Server Setup

The CDLMD_LICENSE_FILE environment variable must also be defined in the Slurm script, consistent with how the starccm.lic file was created:

A.1) Users whose starccm.lic connects directly to the remote PoD server should set:

  export CDLMD_LICENSE_FILE="1999@flex.cd-adapco.com"

A.2) Users whose starccm.lic connects via the sharcnet proxy should set:

  export CDLMD_LICENSE_FILE="1999@license3.sharcnet.ca"

B) Users with a remote institutional license server should similarly set, in their Slurm script:

  export CDLMD_LICENSE_FILE="someport@someserver"

A ticket should first be opened, prior to submitting any jobs, to request verification that the server is reachable.
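
PoD license users (cases A.1 and A.2) must also export their pod key via the LM_PROJECT variable in the same Slurm script, as the sample script below shows. A minimal sketch, where the key value is a placeholder:

  export LM_PROJECT='xxxxxxxxxxxxxxxxxxxxxxxxxxx'   # ~27 character pod key (placeholder)
  export CDLMD_LICENSE_FILE="1999@flex.cd-adapco.com"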

Sample Slurm Submit Script

[roberpj@gra-login2:~] cat myscript
#!/bin/bash
#SBATCH --time=0-01:00               # Time limit: d-hh:mm
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=4          # must be 2 or more
#SBATCH --mem=16000M
#SBATCH --job-name=myscript
#SBATCH --output=slurm-%x.%J.out
module load starccm/13.04.010-R8
export LM_PROJECT='z7aJYeD09XV+t/aeLf9VYg'           # PoD key
export CDLMD_LICENSE_FILE="1999@flex.cd-adapco.com"  # remote PoD license server
export STARCCM_TMP="${SCRATCH}/.starccm-${EBVERSIONSTARCCM}"
mkdir -p "$STARCCM_TMP"                              # scratch directory for temporary files
slurm_hl2hl.py --format STAR-CCM+ > machinefile      # build machinefile from the Slurm allocation
NCORE=$((SLURM_NTASKS * SLURM_CPUS_PER_TASK))        # total cores for this job
starccm+ -power -np $NCORE -machinefile machinefile -batch $HOME/pathto/mysimulation.sim

If unspecified, the default memory per core assigned by the scheduler is only 256MB, as mentioned on https://docs.computecanada.ca/wiki/Running_jobs#Use_sbatch_to_submit_jobs. If your job crashes due to a suspected memory shortage, try doubling the request by setting Y = 2 * cpus-per-task * 256M in #SBATCH --mem=YM. If your job still crashes, try reserving all memory on the node with #SBATCH --mem=0, although this is generally only done when requesting all cores on a single node (for example with #SBATCH --cpus-per-task=32); the only potential drawback is that the scheduler will then choose only the smallest-memory nodes.
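
For example, with #SBATCH --cpus-per-task=4 the doubled request works out to Y = 2 * 4 * 256M = 2048M:

  #SBATCH --mem=2048M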

Legacy Clusters

One can purchase either a sharcnet license, which is hosted on sharcnet's local license server, or a PoD license, which is hosted on a remote license server. Depending on which you have, use the corresponding sqsub command shown here:

License File Holders

License file holders (non-PoD license users) consist of groups who purchase a license file to be run on either a sharcnet license server, a non-sharcnet departmental license server, or a remote commercial license server. All groups in this class should currently set the CDLMD_LICENSE_FILE variable using either:

export CDLMD_LICENSE_FILE=port#@license#.sharcnet.ca

OR

export CDLMD_LICENSE_FILE=/opt/sharcnet/starccmplus/##.##.###/license/license.dat-groupname

o where port# and license# will be assigned via a ticket email
o where ##.##.### corresponds to the module load version being used
o where groupname corresponds to your sponsor's username (currently only vwdcfd1 is set up)

Once CDLMD_LICENSE_FILE is set and the modules are loaded (as shown in the version section above), jobs may be submitted using sqsub:

No Java Macro

sqsub --nompirun -r 60m -q mpi -n 4 --mpp=5G -o ofile.%J starccmplus-openmpi mytestjob.sim [-power]

With Java Macro

sqsub --nompirun -r 60m -q mpi -n 4 --mpp=5G -o ofile.%J starccmplus-openmpi -batch mymacro.java mytestjob.sim [-power]

Orca Submission

When packing nodes one must specify full node(s), otherwise the job will not start. According to https://www.sharcnet.ca/my/systems/show/73 each orca opteron node has 24 cores; therefore, to use the --pack option on orca, replace N*24 with an integer, where N is the number of fully populated nodes. If you have a "Power Session license" then include -power (without the square brackets) or the job will fail. If you have a "ccmpsuite license" do not include the -power option.

sqsub --nompirun -r 60m -q mpi --mpp=5G -n N*24 --pack -f opteron -o ofile.%J starccmplus-openmpi mytestjob.sim [-power]

where the [square] brackets around the [-power] argument mean -power is optional.
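
For example, a job packed onto two full opteron nodes (N=2) would request -n 48:

sqsub --nompirun -r 60m -q mpi --mpp=5G -n 48 --pack -f opteron -o ofile.%J starccmplus-openmpi mytestjob.sim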

PoD License Users

When submitting jobs that use a PoD license, enter the ~27 digit podkey where xxxx is shown in the following sqsub commands. Do NOT change the "-licpath 1999@license3.sharcnet.ca" argument! Note that the sqsub -licpath argument overrides the CDLMD_LICENSE_FILE environment variable.

No Java Macro

sqsub --nompirun -r 60m -q mpi -n 8 --mpp=6G -o ofile.%J starccmplus-openmpi mytestjob.sim [-power]
  -licpath 1999@license3.sharcnet.ca -podkey xxxxxxxxxxxxxxxxxxxxxxxxxxx

With Java Macro

sqsub --nompirun -r 60m -q mpi -n 8 --mpp=6G -o ofile.%J starccmplus-openmpi -batch mymacro.java mytestjob.sim [-power]
-licpath 1999@license3.sharcnet.ca -podkey xxxxxxxxxxxxxxxxxxxxxxxxxxx

NAPP Queue

 sqsub -q NAP_1234 -f mpi --nompirun -r 2d -n 16 --mpp=6G -o ofile.%J starccmplus-openmpi mytestjob.sim 
-licpath 1999@license3.sharcnet.ca -power -podkey xxxxxxxxxxxxxxxxxxxxxxxxxxx

Old Orca Submission (obsolete)

To use the --pack option on orca, replace N*24 with an integer such as 24, 48, ..., N*24 in the following:

sqsub -q NAP_1234 -f mpi --nompirun -r 2d -n N*24 --pack -f opteron --mpp=5G -o ofile.%J starccmplus-openmpi mytestjob.sim
-licpath 1999@license3.sharcnet.ca -power -podkey xxxxxxxxxxxxxxxxxxxxxxxxxxx

Example Job

Queue Submission

ssh orca.sharcnet.ca
module unload intel openmpi
module load gcc/4.9.2 openmpi/gcc492-std/1.8.7

module load starccmplus/double/10.06.009

cp -a /opt/sharcnet/starccmplus/10.06.009/STAR-CCM+VerificationSuite10.06.009/VerificationData/shockTube .
cd shockTube

Case1a)
export CDLMD_LICENSE_FILE=XXXX@licenseX.sharcnet.ca
sqsub --nompirun -r 60m -q mpi -n 4 --mpp=4G -o ofile.%J starccmplus shockTube_ShockTube_final.sim [-power]

Case1b)
sqsub --nompirun -r 60m -q mpi -n 4 --mpp=4G -o ofile.%J starccmplus shockTube_ShockTube_final.sim [-power]
                -licpath XXXX@licenseX.sharcnet.ca

Case2)
export CDLMD_LICENSE_FILE=/opt/sharcnet/starccmplus/##.##.###/license/license.dat-groupname
sqsub --nompirun -r 60m -q mpi -n 4 --mpp=4G -o ofile.%J starccmplus shockTube_ShockTube_final.sim [-power]

Case3)
sqsub --nompirun -r 60m -q mpi -n 4 --mpp=4G -o ofile.%J starccmplus shockTube_ShockTube_final.sim [-power]
            -licpath 1999@license3.sharcnet.ca -podkey xxxxxxxxxxxxxxxxxxxxxxxxxxx

Where:
o Case 1a) and Case 1b) show two different ways of doing the same thing.
o In Case 3) only the -licpath should NOT be changed; enter your own value for the podkey!

Interactive Testing

To quickly verify that your podkey license works as expected on orca, do the following on an orca development node. Note that if you get an out-of-memory error from java, try a different devel node such as orc-dev2, orc-dev3 or orc-dev4, whichever is less busy:

ssh orca.sharcnet.ca
ssh orc-dev1
module load starccmplus/double/9.04.009
starccm+ shockTube_ShockTube_final.sim -np 16 -rsh ssh -mpidriver platform -batch [-power]
        -licpath 1999@license3.sharcnet.ca -podkey xxxxxxxxxxxxxxxxxxxxxxxxxxx

If starccm+ successfully checks out a PoD license the following line will be printed somewhere near the beginning of the screen output:

1 copy of ccmppower checked out from 1999@license3.sharcnet.ca

Assuming that happens the job should continue running. Simply hit ^C to exit when satisfied. If there are problems open a problem ticket and paste your steps and the screen output.
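
For a completed batch job, the same checkout line can be confirmed by filtering the job's output file; a quick sketch, where ofile.12345 is a placeholder name matching the sqsub -o ofile.%J pattern:

grep "checked out" ofile.12345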

General Notes

Job Crashes On Startup

If you get either error message shown below, increase the sqsub mpp value to at least --mpp=3.5G, or --mpp=5G for larger simulations. If your job still fails with 5G, try submitting a series of jobs to find the minimum required mpp value; for instance, specify mpp=5.5G, mpp=6G, mpp=7G and so forth. Once a job starts, kill the queued jobs that had larger mpp values. Note that it's very important to specify the minimum possible mpp to minimize queue wait times.

Job arguments: shockTube_ShockTube_final.sim -power
Error occurred during initialization of VM
Cannot create VM thread. Out of system resources
There is insufficient memory for the Java Runtime Environment to continue.
pthread_getattr_np
An error report file with more information is saved as:
Cannot allocate memory for thread-local data: ABORT

Check License Status

Research groups can check the status of their respective license (including expiration dates and issued feature limits) by running the starstat command. If any jobs are running they will be shown as follows:

Local Sharcnet License Server

This shows how a sharcnet researcher who has purchased a license for the sharcnet license server can query their current (real-time) license consumption status. Notice that there are 30 cores in use, shared among 6 jobs; therefore the needed hpcdomains licenses are calculated as (30-6)=24.

[roberpj@red-admin:~] module unload intel openmpi
[roberpj@red-admin:~] module load gcc/4.9.2 openmpi/gcc492-std/1.8.7 starccmplus/double/10.06.009
[roberpj@red-admin:~] export CDLMD_LICENSE_FILE=XXXX@licenseX.sharcnet.ca
[roberpj@red-admin:~] sqjobs -r
 jobid queue state ncpus nodes time command
283060   mpi     R     5 red21 5.8h starccmplus rotor_set7_4d5m_1st_relax_10ms_0
283061   mpi     R     4 red21 5.8h starccmplus rotor_set7_4d5m_1st_relax_8ms.si
283062   mpi     R     4 red21 5.8h starccmplus rotor_set7_4d5m_1st_relax_6ms.si
283065   mpi     R     5 red21 5.0h starccmplus rotor_set7_4d5m_1st_relax_11ms_0
283077   mpi     R     6 red21 1.5h starccmplus rotor_set7_4d5m_1st_relax_9ms_00
283078   mpi     R     6 red22 1.5h starccmplus rotor_set7_4d5m_1st_relax_7ms_00

[roberpj@red-admin:~]  starstat
Feature                 Version   # licenses    Expires         Vendor
ccmpsuite               2014.02       6         28-feb-2014     cdlmd
hpcdomains              2014.02       24        28-feb-2014     cdlmd
Users of ccmpsuite:  (Total of 6 licenses issued;  Total of 6 licenses in use)
Users of hpcdomains:  (Total of 24 licenses issued;  Total of 24 licenses in use)

Remote Podkey License Server

To inspect the license server status for star jobs running on the cluster, carry out the following four steps.

I) First load a sharcnet starccmplus module, which puts the starstat command into your path:

[roberpj@win241:~] module unload intel openmpi
[roberpj@win241:~] module load gcc/4.9.2 openmpi/gcc492-std/1.8.7 starccmplus/double/11.06.011
[roberpj@win241:~] export CDLMD_LICENSE_FILE=1999@license3.sharcnet.ca

II) Now check the status of the remote podkey license server by running:

[roberpj@win241:~] starstat | grep -v start
Using CDLMD_LICENSE_FILE = 1999@license3.sharcnet.ca
lmutil - Copyright (c) 1989-2014 Flexera Software LLC. All Rights Reserved.
License server status: 1999@flex3.cd-adapco.com
    License file(s) on flex3.cd-adapco.com: /home/flexlm/PoDlmgrd/flex3.dat:
flex3.cd-adapco.com: license server UP (MASTER) v11.14.0
Vendor daemon status (on flex3.cd-adapco.com):
     cdlmd: UP v11.14.0
Feature usage info:
Users of ccmppower:  (Total of 4000 licenses issued;  Total of 1154 licenses in use)
  "ccmppower" v3000.0, vendor: cdlmd, expiry: 03-jul-2018
  floating license
Users of print:  (Total of 1 license issued;  Total of 0 licenses in use)
Users of shutdown:  (Total of 1 license issued;  Total of 0 licenses in use)
Users of starsuite:  (Total of 4000 licenses issued;  Total of 0 licenses in use)
Feature                         Version     #licenses    Expires      Vendor
_______                         _________   _________    __________   ______
ccmppower                       3000.0       4000        03-jul-2018  cdlmd
print                           1.0          1           03-jul-2018  cdlmd
shutdown                        1.0          1           03-jul-2018  cdlmd
starsuite                       3000.0       4000        03-jul-2018  cdlmd

III) To list all jobs you have registered as running on the remote podkey license server, run:

 [roberpj@win241:~] starstat | grep $USER
    roberpj win2 /dev/tty (v2015.10) (flex3.cd-adapco.com/1999 93442), start Mon 3/24 10:10
    roberpj win9 /dev/tty (v2015.10) (flex3.cd-adapco.com/1999 93442), start Wed 5/24 12:15

IV) The sample output in III indicates two jobs are running; list their job numbers by running:

[roberpj@win241:~] sqjobs
jobid queue state ncpus    nodes  time command
----- ----- ----- ----- -------- ----- -------
11163   mpi     R    32   win[1-3]  48.1h starccmplus-openmpi wingtestjob1.sim -p
11178   mpi     R    32   win[4-9]  11.7h starccmplus-openmpi wingtestjob2.sim -p

Running the GUI

To run starccm+ in full graphical mode for simulation setup or post-processing, the wrapper script starccmplus-gui should be used. Before doing this, however, one must first establish a graphical connection to either a sharcnet visualization workstation or a cluster development node as explained in the following steps:

On the Compute Canada gra-vdi Machine

Using the Legacy Sharcnet Module

These modules are part of the SnEnv environment. To get started, connect to gra-vdi.computecanada.ca with TigerVNC, open a terminal window and run:

module load SnEnv
module load starccm/r8/13.06.012   OR   module load starccm/mixed/13.06.012
export CDLMD_LICENSE_FILE=port#@servername
starccm-gui

Using the Compute Canada Module

Coming soon. The modules are part of the default CcEnv environment.

On a Development Node

o First connect to an orca login node. If connecting from linux or mac use:

ssh -Y orca.sharcnet.ca  OR  ssh -Y saw.sharcnet.ca

If connecting from Windows use either:
a) xming as described here.
b) mobaxterm as described here.

o Next connect into an orca development node using ssh, load a star module, then launch starview+ or starccm+ as shown here:

ssh -Y orc-dev1   OR   ssh -Y saw-dev1 
module unload intel openmpi
module load gcc/4.9.2 openmpi/gcc492-std/1.8.7 starccmplus/mixed/10.06.009

# Users with a local sharcnet license should do: 
export CDLMD_LICENSE_FILE=XXXX@licenseX.sharcnet.ca
starccmplus-gui shockTube_ShockTube_final.sim [-power]

# Users with a remote podkey license should do:
starccmplus-gui shockTube_ShockTube_final.sim [-power] -licpath 1999@license3.sharcnet.ca -podkey xxxxxxxxxxxxxxxxxxxxxxxxxxx 

Note that starccmplus-gui starts starccm+ in parallel mode with "-np 4" cores, which is assumed to be an optimal default. This can be overridden by appending the "-np #" switch. To revert to serial mode once in the GUI, click File -> Load Simulation on the pull-down menu, select Serial, and click Ok; the session should restart in serial mode.
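
For example, to open a simulation with eight cores instead of the default four:

starccmplus-gui shockTube_ShockTube_final.sim -np 8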

On Visualization Machines

First you must install the VNC client application TigerVNC, which will enable you to connect efficiently and securely to vdi-centos6.user.sharcnet.ca or viz10-uwo.sharcnet.ca. Once connected, a remote desktop should appear (sometimes more than one connection attempt is required). Login with your sharcnet username and password, then open a terminal window by clicking "Applications -> System Tools -> XTerm" and do one of the following, where items between square brackets are optional:

Sharcnet License

o Users with a local sharcnet license should do:

module load starccmplus/mixed/10.06.009
export CDLMD_LICENSE_FILE=XXXX@licenseX.sharcnet.ca
starccmplus-gui -new [-np 4] [-power]
OR
starccmplus-gui shockTube_ShockTube_final.sim [-np 4] [-power]

PoDkey License

o Users with a remote PoDkey license should do the following, where the -power switch in square brackets is optional:

module load starccmplus/mixed/10.06.009
starccmplus-gui -new [-power] [-np 4] -licpath 1999@license3.sharcnet.ca -podkey xxxxxxxxxxxxxxxxxxxxxxxxxxx
OR
starccmplus-gui shockTube_ShockTube_final.sim [-power] [-np 4] -licpath 1999@license3.sharcnet.ca -podkey xxxxxxxxxxxxxxxxxxxxxxxxxxx

As noted above, starccmplus-gui starts starccm+ in parallel mode with "-np 4" cores by default; this can be overridden by appending the "-np #" switch, and serial mode can be selected via File -> Load Simulation once in the GUI.

Flexlmrc File Issues

When running jobs, starccm+ automatically creates a ~/.flexlmrc file in your home directory and appends the license server you specify to CDLMD_LICENSE_FILE. This can become a problem if your sharcnet license file is moved to a new license server, since your .flexlmrc file will retain the previous value and append the new one. If you then exhaust your current license (in which case a DENIED entry will appear in your license server's log file), starccm+ will recursively go through the other servers listed in CDLMD_LICENSE_FILE, attempting each one until the job runs. In the event another research group was assigned to your old license server, this could lead to obvious complications and unexpectedly failing jobs. To fix the problem, edit the file and ensure there is only ONE license server specified in your flex config file, as shown below:

[roberpj@hnd20:~] cat .flexlmrc
CDLMD_LICENSE_FILE=XXXX@licenseX.sharcnet.ca

A corner case of this problem can occur for users who have BOTH a sharcnet license and a podkey license and like to alternate between them. Prior to every job, verify that your flexlmrc config points to the correct server, otherwise the wrong server could be used!
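
As a minimal sketch, the file can be reset from the shell; substitute your assigned port and server for the placeholders:

echo 'CDLMD_LICENSE_FILE=XXXX@licenseX.sharcnet.ca' > ~/.flexlmrc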

Missing -power Option

If you're submitting a PoD job and it fails with the error message below, try adding -power to your sqsub command as shown in the examples above. This is because the ccmpsuite type of license is assumed by default.

Checking license file: 1999@license3.sharcnet.ca
PoD database message : -50001 FAILED_UNCHARGED_CHECKOUT
Failed to get all licenses needed for this job. Asked for 1 licenses of ccmpsuite
MPI Application rank 0 exited before MPI_Finalize() with status 1
Terminated

Additional Options

Additional options besides -power may be appended to the sqsub or starccmplus-gui commands. A partial listing of these may be found by running starccm+ -help as shown below. For example, to start a new simulation in parallel use starccmplus-gui -new -np 4, and so forth:

[roberpj@vdi-centos6:~] module load starccmplus/double/10.06.009

[roberpj@vdi-centos6:~] starccm+ -help

Usage: starccm+ [-server] [<options> ...] [<simfile>]

Where:
  -server                # Starts the STAR-CCM+ server. The default is to start the STAR-CCM+ client.

General options:
  -info                  # Prints information about the simulation file.
  -ini <file>            # Specify an .ini file to provide default starccm+ arguments.
  -loc                   # Prevent the server locator from starting.
  -new                   # Create a new simulation. If a simulation file is named and does not exist, it is created.
  -v, -vv, -vvv          # Verbose mode. Prints environment changes and subcommands.
  -version               # Print the version information and exit.
  -rsh <rsh command>     # Specify the remote shell command to use (default rsh).
  <simfile>              # Use the supplied simulation file (eg. star.sim).

License options:
  -icemeshing            # Start an IC Engine meshing session.
  -lite                  # Use lite session license (reduced functionality).
  -powerpre              # Use 10 DOEtoken licenses to enable meshing and pre/post
  -nosuite               # Do not check out <name>suite licenses for additional nodes.
  -tokensonly            # Only use DOEtoken licenses.
  -notokens              # Do not use DOEtoken licenses.
  -doe-prefer-hpcdomains # Use hpcdomains licenses before using DOEtoken licenses for optimization studies.
  -power                 # Use Power Session license option.
  -podkey  <value>       # Specify a PoD license key
  -doeuuid <value>       # Specify a DoE UUID
  -doepower              # Use Power Session license for an optimization session license
  -licpath <path:...>    # Specify a license path that overrides the default license path
  -reserve <lic1,...>    # Specify which add-on licenses to reserve when a simulation is created/restored
  -noreserve             # Specify that no add-on licenses will be reserved when a  simulation is created/restored
  -noretry               # Specify no retry of required licenses
  -norelease             # Specify no release of reader licenses after geometry import

References

o STAR-CCM+ Homepage
https://mdx.plm.automation.siemens.com/star-ccm-plus

o Power-Sessions and Power-on-Demand Licensing
https://mdx2.plm.automation.siemens.com/sites/default/files/flier/pdf/Siemens-PLM-STAR-CCM-power-licensing-fs-58546-A3.pdf

o PoD vs Flex Licensing
https://community.plm.automation.siemens.com/t5/Formula-Student-FSAE-Forum/STAR-CCM-How-does-the-licensing-work/td-p/415623

o Sci-Net STAR-CCM+ Wiki
https://docs.scinet.utoronto.ca/index.php/STAR-CCM%2B

o How to set up and use Power-on-Demand (PoD) licensing
https://thesteveportal.plm.automation.siemens.com/articles/en_US/Video/How-to-set-up-and-use-Power-on-Demand-licensing