|Description: Suite of programs that allow users to carry out fluid flow simulation|
|SHARCNET Package information: see ANSYS software page in web portal|
|Full list of SHARCNET supported software|
A) INTERACTIVE USAGE ON SHARCNET VISUALIZATION WORKSTATIONS
Like the SHARCNET clusters, each SHARCNET visualization workstation provides users with three different locations for storing and accessing files: /home (1GB quota, no time limit, backed up), /work (1TB quota, no time limit, not backed up) and /scratch (no quota, 4 month time limit, not backed up). Note that fluent should never be used to access simulation files under /home since doing so a) stresses the home account file server, b) is severely space limited and c) is generally much slower than /work or /scratch.
It is recommended that simulation files be transferred from /work or /home to /scratch on the viz stations to get the best performance when using them with any of the ansys gui applications.
If you are connecting from a Windows pc we recommend using either:
1) <a href='https://www.sharcnet.ca/help/index.php/Remote_X_Connections_to_Sharcnet#NX_Client'>nxclient</a>
2) <a href='https://www.sharcnet.ca/help/index.php/Remote_X_Connections_to_Sharcnet#Xming_Software'>xming</a>
3) <a href='http://x.cygwin.com/docs/ug/using-remote-apps.html#using-remote-apps-ssh'>cygwin</a>
If you are connecting from a linux pc then simply use:
4) <a href='http://linux.about.com/od/commands/l/blcmdl1_ssh.htm'>ssh</a> -Y username@vizN-site.sharcnet.ca
Once connected locally (sitting at the terminal) or remotely (by nxclient, xming, cygwin or ssh) the ANSYS fluent, cfx or icemcfd applications can be started using one of the following commands (noting that icemcfd will not work over an xming/putty connection):
module load ansys/13.0
cd directory/containing/simulation/files
fluent-gui
cfx-gui
icemcfd-gui
Visualization Workstations Supporting Remote Connections
1) viz1-wlu.sharcnet.ca 16GB Memory - Quad Core (nxclient access)
2) viz2-wlu.sharcnet.ca 16GB Memory - Quad Core (nxclient access)
3) viz1-brocku.sharcnet.ca 16GB Memory - Quad Core (nxclient access)
4) viz2-brocku.sharcnet.ca 16GB Memory - Quad Core (nxclient access)
Servers Supporting Remote Connections
1) tope.sharcnet.ca 32GB Memory - Dual Quad Core (nxclient access)
2) school.sharcnet.ca 8GB Memory - Dual Quad Core (ssh -X access)
B) USING CFX ON SHARCNET CLUSTERS
<a name="cfxsubjobs">SUBMITTING CFX JOBS TO THE QUEUE</a>
The values "-n 1", "-n 8", "-n 24", "-N 1" wherever shown in the following sqsub commands must remain fixed to ensure optimal performance. Only the variable "ncpus" can safely be varied. For example, to run jobs with "-n 4" at SHARCNET only HOUND should be used; similarly, "-n 24" size jobs should only be run on Orca.
module switch ansys/13.0
ORCA/REDFIN (parallel):
sqsub -t -r 1h --nompirun -q mpi -n 24 -N 1 -o ofile.%J cfx HydrofoilGrid.def
HOUND (parallel):
sqsub -t -r 1h --nompirun -q mpi -n ncpus -N 1 -o ofile.%J cfx HydrofoilGrid.def
- OR -
sqsub -t -r 1h --nompirun -q mpi -n ncpus -N 1 -o ofile.%J -f xeon cfx HydrofoilGrid.def
- OR -
sqsub -t -r 1h --nompirun -q mpi -n ncpus -N 1 -o ofile.%J -f opteron cfx HydrofoilGrid.def
SAW/MAKO (parallel, where -t should be omitted for mako):
sqsub -t -r 1h --nompirun -q mpi -n 8 -N 1 -o ofile.%J cfx HydrofoilGrid.def
SILKY (parallel):
sqsub -t -r 1h --nompirun -q mpi -n ncpus -N 1 -o ofile.%J cfx HydrofoilGrid.def
ORCA/REDFIN/HOUND/SAW/MAKO/SILKY (serial):
sqsub -t -r 1h -q serial -n 1 -o ofile.%J cfx HydrofoilGrid.def
The above sqsub submit commands were tested with the Ansys Hydrofoil example, which can be retrieved by doing the following copy command:
[roberpj@orca:~/testing/cfx] cp /opt/sharcnet/ansys/13.0/v130/CFX/examples/Hydrofoil* .
[roberpj@orca:~/testing/cfx] ls
HydrofoilExperimentalCp.csv  HydrofoilGrid.def  HydrofoilIni_001.res  HydrofoilIni.pre  Hydrofoil.pre
<a name="CFX_USAGE_HINTS">CFX USAGE HINTS</a>
1) ENABLING DOUBLE PRECISION
If appending the -double switch to the sqsub command line when submitting jobs does not work, try manual configuration by opening the CFX-Pre GUI and doing the following:
Execution Control ---> Run Definition ---> Executable Selection ---> Double Precision
2) PERIODIC BACKUPS with CFX-GUI
Output Control ---> Backup ---> Add new item ---> Output Frequency
3) SPECIFY RESTART FILE from COMMAND LINE
sqsub -r 1h --nompirun -q mpi -n 4 -N 1 -o ofile.%J cfx HydrofoilGrid.def -continue-from-file HydrofoilIni_002.res
C) FLUENT USAGE INSTRUCTIONS ON SHARCNET CLUSTERS
<a name="core">HOWTO SUBMIT FLUENT JOBS</a>
Before a fluent job can be submitted to the queue on a SHARCNET cluster, the following are required: i) an initialized dat file, ii) a proven functional cas file, and iii) a corresponding journal file.
These files should be placed into a directory under /scratch/username or /work/username. From this directory you may then submit your jobs as described in step 2) or 3) below.
Note that fluent jobs should never be submitted from a directory located under your home account (/home/username) since this file system has a 200MB quota and has extremely slow performance characteristics compared to /scratch or /work !
As of October 2009 a new command "ansysstat" is available on all clusters which shows the total number of fluent licenses checked out from the central license server. The SHARCNET license allows a total of 25 jobs to run at one time. If you submit a job to the queue but all ansys licenses are already in use when your job attempts to run, it will sit idle (though appear running according to sqjobs) until sufficient ansys licenses become available, for up to 24 hours before being killed - in such cases you will see the error message "Fatal error has happened to some of the processes! Exiting."
o JOB SUBMISSION SYNTAX
To submit a short test job to the serial queue (requiring only 1 processor) :
sqsub --test -r 1h -q serial -o ofile.%J fluent Model sample.jou
To submit a parallel fluent job to the mpi queue (requiring 2 or more processors) :
sqsub -r 1h --nompirun -q mpi -n nCpus -o ofile.%J fluent Model sample.jou
where nCpus = 2, 4, 8, 16, 32, 64
where Model = 2d, 2ddp, 3d, 3ddp
where --test is spelled with a double dash (not supported on mako)
where --nompirun is spelled with a double dash
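The submission syntax above can be sketched as a small shell helper. This is illustrative only: the variable names are my own, and the command is echoed rather than executed so the sketch can be tried anywhere without sqsub present.

```shell
# Sketch: assemble the fluent sqsub command line from the syntax above.
# MODEL, NCPUS and JOURNAL are illustrative placeholders, not sqsub options.
MODEL=3ddp          # one of: 2d, 2ddp, 3d, 3ddp
NCPUS=4             # one of: 2, 4, 8, 16, 32, 64
JOURNAL=sample.jou
if [ "$NCPUS" -eq 1 ]; then
  # serial test queue for single-processor runs
  CMD="sqsub --test -r 1h -q serial -o ofile.%J fluent $MODEL $JOURNAL"
else
  # mpi queue for parallel runs
  CMD="sqsub -r 1h --nompirun -q mpi -n $NCPUS -o ofile.%J fluent $MODEL $JOURNAL"
fi
echo "$CMD"   # on a real cluster, run the command instead of echoing it
```

On a cluster the echo would simply be replaced by executing the assembled command.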
The SHARCNET Fluent license allows up to 8 jobs to run at once using up to 128 cpus in total. Note that the majority of clusters at SHARCNET where fluent is installed have 2GB/core; therefore, if memory is a concern, choose the number of cpus (cores) by roughly dividing your total memory needs in GB by two.
o KRAKEN SPECIFIC
o When submitting parallel jobs to kraken the "-f myri" option must be used, for instance:
sqsub -r 1h --nompirun -q mpi -n 8 -f myri -o ofile8.%J fluent Model sample.jou
o HOUND SPECIFIC
Due to the fat node design of hound, much greater memory per processor (mpp) than the default 2GB per core is possible. For example, to run a serial fluent job requiring 64GB memory specify:
sqsub --test -r 1h -q serial --mpp=64G -o ofile.%J fluent Model sample.jou
Or to run an 8cpu parallel fluent job needing 64GB memory in total do:
sqsub -r 1h --nompirun -q mpi -n 8 --mpp=8G -o ofile.%J fluent Model sample.jou
Aside: It is theoretically possible to request a maximum of 2TB (2048GB) of memory on hound by specifying "-n 128 --mpp=16G", however this would require very special arrangements since the whole ansys license would be consumed!
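Picking --mpp by hand is error prone. As a rough sketch (my own helper arithmetic, not a SHARCNET tool), the per-core value can be derived from the total memory a case needs and the core count:

```shell
# Sketch: derive the --mpp (memory per process) value from the total
# memory a job needs and the number of cores requested with -n.
TOTAL_GB=64   # total memory the case needs, in GB
NCPUS=8       # cores requested with -n
MPP=$((TOTAL_GB / NCPUS))   # 64GB over 8 cores -> 8GB per core
echo "sqsub -r 1h --nompirun -q mpi -n $NCPUS --mpp=${MPP}G -o ofile.%J fluent 3ddp sample.jou"
```

This reproduces the 8cpu/64GB example above, where --mpp=8G is the matching per-core request.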
For researchers belonging to the com3 group add "-f com3" or "-f xeon,com3" to sqsub:
sqsub -r 7d --nompirun -q mpi -n nCpus -f xeon,com3 -o ofile.%J fluent Model sample.jou
o BATCH FILE HINTS
o Select Print (to watch residuals: tail -f ofileJOB#)
o Deselect Plot (since batch jobs do not provide graphics)
o Set Case File Frequency = 0
o Specify Filename = itrFileName (cleanup afterwards: rm -f itr*.dat)
o Set Data File Frequency = 1000 (save itrFileName.dat ~once/day)
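For batch runs, the GUI hints above can equally be set from a journal file. The following is a hedged sketch of TUI equivalents: the exact menu paths and prompts vary between Fluent versions, so verify each line interactively in fluent-gui before relying on it.

```
; sketch only - verify these TUI paths in your Fluent version
solve/monitors/residual/print? yes   ; watch residuals via tail -f ofileJOB#
solve/monitors/residual/plot? no     ; batch jobs do not provide graphics
file/auto-save/root-name itrFileName ; cleanup afterwards: rm -f itr*.dat
file/auto-save/data-frequency 1000   ; save itrFileName.dat roughly once/day
```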
o RECOMMENDED SCALING TEST PROCEDURE
The optimal choice of nCpus for a fluent case will typically be very "machine dependent". Therefore, before running a long fluent job it is recommended that a few short test cases be run through the test queue to investigate the wall clock time required for job completion. Generally expect to find that the optimal nCpus is roughly equal to the total memory requirement in gigabytes.
The test queue is ideal for scaling tests since it allows jobs to start running almost immediately. However, such test jobs must finish within the 60 minute time limit of the queue; likely one to ten timesteps can be run for most cases. From the results of such experimentation one should be able to gain insight to help choose the amount of parallelization. For example:
a) First determine wallclock time for 1cpu :
sqsub --test -r 1h -q serial -o ofile%J fluent 2d sample.jou
b) Next determine wallclock time for 2cpu :
sqsub --test -r 1h --nompirun -q mpi -n 2 -o ofile%J fluent 2d sample.jou
c) Next determine wallclock time for 4cpu :
sqsub --test -r 1h --nompirun -q mpi -n 4 -o ofile%J fluent 2d sample.jou
d) Continue to investigate nCpus = 8, 16, 24, 32.
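Steps a) through d) can be scripted. The loop below is a dry run of my own devising: it only echoes each submit line so the commands can be reviewed (or copied onto a cluster) rather than submitted blindly.

```shell
# Sketch: generate the scaling-test submit commands for several core
# counts. Echo only - run the commands for real on an actual cluster.
MODEL=2d
JOURNAL=sample.jou
COUNT=0
for N in 1 2 4 8 16 24 32; do
  if [ "$N" -eq 1 ]; then
    # single processor goes through the serial test queue
    CMD="sqsub --test -r 1h -q serial -o ofile%J fluent $MODEL $JOURNAL"
  else
    # two or more processors go through the mpi test queue
    CMD="sqsub --test -r 1h --nompirun -q mpi -n $N -o ofile%J fluent $MODEL $JOURNAL"
  fi
  echo "$CMD"
  COUNT=$((COUNT + 1))
done
```

Comparing the resulting wall clock times across the seven runs then indicates where the scaling flattens out.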
Typically small 2d fluent cases requiring less than 200MB of memory will complete fastest with nCpus = 1 or 2, while larger single precision (2d or 3d) or double precision (2ddp or 3ddp) cases will be more likely to benefit from nCpus > 2. The CPU Time value printed at the bottom of the queue output file (ofileJOB#) upon successful completion of a job should not be confused with wall clock time when determining the optimal choice of the nCpus parameter, otherwise extremely inefficient usage of the resources and hence large amounts of wasted compute time could result! Clusters with GigE interconnects are particularly prone to this behaviour, where communication overhead can quickly dominate, driving down %CPU utilization.
The following table gives an example of this point for a small test case where n=2 is shown to be the optimal choice for full runs. It is worth mentioning that choosing n=1 for this example would free the second processor and second fluent license for others to use, while only costing about 15% extra wait time for the job to complete.
|n=1||106.61 sec.||107 sec||98 sec||98.6|
|n=2||47.00 sec.||92 sec||63 sec||82.2|
|n=4||54.00 sec.||107 sec||53 sec||58.6|
|n=8||70.00 sec.||186 sec||62 sec||43.2|
|n=16||77.00 sec.||559 sec||69 sec||14.6|
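Assuming the second timing column of the table holds wall-clock time (an interpretation on my part, consistent with the ~15% figure quoted above), speedup and parallel efficiency can be computed as a quick check:

```shell
# Sketch: speedup and parallel efficiency from measured wall-clock times.
# The times below are copied from the table above; treating them as
# wall-clock seconds is an assumption.
OUT=$(awk 'BEGIN {
  split("1 2 4 8 16", n);
  split("107 92 107 186 559", wall);        # wall-clock seconds per run
  for (i = 1; i <= 5; i++) {
    s = wall[1] / wall[i];                  # speedup relative to n=1
    printf "n=%d speedup=%.2f efficiency=%.0f%%\n", n[i], s, 100 * s / n[i];
  }
}')
echo "$OUT"
```

With these numbers n=2 gives the best wall-clock time, and efficiency collapses beyond n=4, matching the discussion above.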
o SAMPLE JOURNAL FILES
; 1) STEADY PROBLEM - INTERPRETED UDF
; ===================================
; Perform Following Define Modifications to Cas File and Save
; Define --> User-Defined --> Functions --> Interpreted --> Unset File Name
define/user-defined/interpreted-functions "myudf.c" "cpp" 10000 no
file/read-cas-data CasDatFileName
file/auto-save/case-frequency if-case-is-modified
file/auto-save/data-frequency 100 ; save numbered dat file every 100 iterations
file/auto-save/root-name itrCasDatFileName ; file format will be itrCasDatFileName-####.dat
file/confirm-overwrite y
solve/iterate 1000
file/write-data finCasDatFileName
exit yes
; 2) UNSTEADY PROBLEM - COMPILED UDF (x86_64 systems, 3d model)
; ============================================================
; Perform following cas file modifications using fluent-gui and save.
; Define --> User-Defined --> Functions --> Interpreted --> (unset Source File Name)
; Define --> User-Defined --> Functions --> Compiled --> (add/build or load libudf)
; Define --> User-Defined --> Functions --> Manage --> (unload all UDF libraries)
; Define --> User-defined --> Functions --> Manage --> Load (type libudf then click Load)
define/user-defined/compiled-functions load "libudf/lnamd64/3d/libudf.so"
file/read-cas-data CasDatFileName
file/auto-save/case-frequency if-mesh-is-modified
file/auto-save/data-frequency 100 ; save numbered dat file every 100 time steps
file/auto-save/root-name itrCasDatFileName ; file format will be itrCasDatFileName-####.dat
file/confirm-overwrite y
solve/set/time-step 0.001
solve/dual-time-iterate 1000 40
file/write-data finCasDatFileName
exit yes
o number physical time steps specified = 1000
o maximum number iterations per time step = 40
o time step in secs (overrides cas file) = 0.001
o frequency to save dat file solution = 100
BOUNDARY CONDITION CHECK
Assuming boundary conditions were defined using interpreted UDF(s) during the case file construction/setup, it is worthwhile to verify such conditions are still defined before running large jobs. This can be done by first interpreting the UDF, then reading in the corresponding cas file using "fluent-gui", and then inspecting Define ===> Boundary Condition ===> Click inlet ===> Click Set ===> Velocity Magnitude, which should reference the velocity profiles defined in the udf, ie) udf YourProfileName. If this procedure is done with a compiled UDF, one will find the boundary conditions are not defined unless the UDF is either interpreted first or hooked in. The former approach (workaround) is used in the Compiled UDF example journal file above.
CREATING A COMPILED UDF ON SHARCNET
The recommended procedure to create a compiled UDF on SHARCNET involves using fluent-gui as follows. First cd into the directory where your cas and dat files are located, then remove or rename the libudf subdirectory if it exists. Next start fluent-gui selecting the appropriate precision and dimensionality such as 3d and single precision [optionally now read in your cas file] and then from the pull down menu click Define --> User-Defined --> Functions --> Compiled.
On the left side of the Compiled UDFs popup box click Add, highlight your UDF file (such as myudf.c), then click OK and finally click Build. Assuming the Library Name field said libudf, then a library file named libudf/lnamd64/3d/libudf.so should have been created. If you were working on an Itanium based SMP system such as Silky, then libudf.so will have been created under libudf/lnia64/3d.
At this point, if you wish to continue working interactively, click Load in the Compiled UDFs popup box then read in your dat file; otherwise exit fluent-gui and set up your journal file appropriately before submitting a job to the queue, after ensuring the Source File Name field located under Define --> User-Defined --> Functions --> Interpreted is empty.
PARALLEL VS SERIAL UDF
If you see the following message when launching a parallel fluent case, then the UDF you are using is not designed for parallel processing. If this is the case, restart your fluent job in serial mode.
Primitive Error at Node 1: open_udf_library: No such file or directory
Primitive Error at Node 2: open_udf_library: No such file or directory
Primitive Error at Node 3: open_udf_library: No such file or directory
Primitive Error at Node 4: open_udf_library: No such file or directory
o FILE CORRUPTION ISSUES
Text files created on windows machines which are not transferred properly onto SHARCNET (via sftp in ascii mode) may contain control-M (DOS end of line marker) or control-Z (DOS end of file marker) codes. Batch jobs with journal files affected by this will terminate for no apparent reason immediately after starting. The error messages which appear in the ofile# due to such a problem can vary. For instance, UDF files will fail to load and compile, resulting in an error message in the output such as "Error: yourUDFfile.c: line 1: syntax error" followed by hundreds of messages such as "Error: chip-exec: function XYZ not found". Or you might see an error message such as "Error: eval: unbound variable" or "invalid integer" and so forth.
The presence of these problematic DOS codes can be detected by opening your file (typically your journal file) on SHARCNET in a text editor such as vi or nano and scanning for a message respectively such as [ Read 6 lines (Converted from DOS format) ] or [dos]. To convert the file to a unix compatible format run the following command, then reopen it to verify that such dos conversion no longer takes place, following which your simulation should run.
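A hedged sketch of such a detection and conversion follows; the sample.jou content is fabricated here purely for illustration, and the tr pipeline is a portable stand-in for a dedicated conversion tool such as dos2unix where one is available.

```shell
# Sketch: detect and strip DOS carriage returns from a journal file.
# The sample file is created here only so the sketch is self-contained.
printf 'solve/iterate 10\r\nexit yes\r\n' > sample.jou   # fake DOS-format file
DETECTED=no
od -c sample.jou | grep -q '\\r' && DETECTED=yes         # any \r bytes present?
if [ "$DETECTED" = yes ]; then
  echo "DOS format detected - converting"
  tr -d '\r' < sample.jou > sample.jou.tmp && mv sample.jou.tmp sample.jou
fi
CLEAN=no
od -c sample.jou | grep -q '\\r' || CLEAN=yes            # verify conversion
[ "$CLEAN" = yes ] && echo "file is now unix format"
rm -f sample.jou
```

After conversion, reopening the file in nano or vi should no longer show the DOS-format notice.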
o JOB KILL VMEM EXCEEDED LIMIT
o Some parallel fluent jobs will need fairly large VIRT memory per process, as shown by the top command, compared to the real RES memory allocation per process. For instance, the Bramble Altix has 124GB main memory (plus 72GB swap). When an 8cpu job is started, top briefly displays the following, then the error message is printed to the output ofile.
[roberpj@bramble:~] sqsub -r 1h --nompirun -q mpi -n 8 -o ofile8cpu fluent 3d test.jou
[roberpj@bramble:~] top
Mem: 125714M total, 30659M used, 95055M free, 0M buffers
Swap: 76294M total, 0M used, 76294M free, 22150M cached
  PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
92820 roberpj  20  0 31.9g  84m  13m R  100  0.1  0:12.80 fluent_mpi.12.1
92821 roberpj  20  0 31.9g  84m  13m R  100  0.1  0:12.79 fluent_mpi.12.1
92815 roberpj  18  0 31.9g  88m  14m R  100  0.1  0:06.66 fluent_mpi.12.1
92816 roberpj  20  0 31.9g  90m  13m R  100  0.1  0:12.70 fluent_mpi.12.1
92818 roberpj  20  0 31.9g  82m  14m R  100  0.1  0:12.76 fluent_mpi.12.1
92819 roberpj  25  0 31.9g  85m  13m R  100  0.1  0:12.77 fluent_mpi.12.1
92822 roberpj  20  0 31.9g  88m  14m R  100  0.1  0:12.80 fluent_mpi.12.1
92817 roberpj  18  0 31.9g  85m  13m R  100  0.1  0:12.76 fluent_mpi.12.1
[roberpj@bramble:~] cat ofile8cpu | grep vmem
auto partitioning mesh by Principal Axes
=>> PBS: job killed: vmem 308727316480 exceeded limit 133143986176
o This problem can be mitigated by reducing the number of processors from 8 to 4, which results in a corresponding 50% decrease in VIRT per processor, bringing the total from 8x32G=256G=~308727316480 down to a manageable 4x16.2G=64.8G, and hence the job runs without tripping the vmem error this time. Another solution is to use the Silky Altix, which has double the memory and is configured to support much larger jobs.
[roberpj@bramble:~] sqsub -r 1h --nompirun -q mpi -n 4 -o ofile4cpu fluent 3d test.jou
[roberpj@bramble:~] top
Mem: 125714M total, 30824M used, 94890M free, 0M buffers
Swap: 76294M total, 0M used, 76294M free, 22150M cached
  PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
90690 roberpj  25  0 16.2g 200m  16m R  100  0.2 10:59.83 fluent_mpi.12.1
90691 roberpj  25  0 16.2g 207m  16m R  100  0.2 11:00.20 fluent_mpi.12.1
90692 roberpj  25  0 16.2g 200m  16m R  100  0.2 11:00.34 fluent_mpi.12.1
90689 roberpj  18  0 16.2g 210m  16m R  100  0.2 10:45.82 fluent_mpi.12.1
o CHIP-EXEC FUNCTION NOT FOUND
If you see the following message regarding missing routines from your UDF (for instance when starting a 4cpu parallel job), it means that you have likely read your data file before executing either an interpreted-functions or compiled-functions line in your journal file. As shown in the two examples above, the UDF is read first to avoid this; in other words, order is important.
Reading "CasDatFileName.dat"...
Error: chip-exec: function "inletprofile" not found.
Error: chip-exec: function "inletprofile" not found.
Error: chip-exec: function "inletprofile" not found.
Error: chip-exec: function "inletprofile" not found.
Done.
o CPP NOT FOUND FOR INTERPRETED UDF
This error may appear on Altix systems (such as Silky). If so, explicitly define /usr/bin/cpp in the Define --> User-Defined --> Functions --> Interpreted --> CPP Command Name field using fluent-gui, save the cas file, then re-submit your job through the queue using sqsub.
o HOWTO MONITOR SOLUTION RESIDUALS
It is possible to watch the residuals as your fluent solution progresses while it runs in the (text based) batch queue. To do this simply run: tail -f ofileJID# where JID# is the LSF job number assigned by the queue when you submit a job. You can also use the sqjobs command to check the JID# of running or queued jobs.
CHANGES IN COMMANDS FROM FLUENT 6 TO FLUENT 12.1
As a warning, the online documentation is out of date for certain commands. For instance, the only way to know the correct arguments for case-frequency is to issue the following TUI command from within fluent-gui:
> file auto-save case-frequency (enter)
When the data file is saved, save the case> (enter)
each-time  if-case-is-modified  if-mesh-is-modified
> file auto-save data-frequency
frequency to write data files (iterations)
A listing of all available commands can be found by doing:
> help
Type "?<command>" for help on a specific command.
Type "?" to enter help mode, and "q" to exit help mode.
The commands below are available in this menu.
Those followed by "*" are currently disabled.
adapt/         file/          solve/
close-fluent*  mesh/          surface/
define/        parallel/      turbo/*
display/       plot/          views/
exit           report/
GRAPHICS and ANIMATION RESOURCES
o <a href="https://www.sharcnet.ca/Software/Fluent12/html/ug/node775.htm"> Chapter 26: Using the Solver</a> (see Animating the Solution)
o <a href="https://www.sharcnet.ca/Software/Fluent12/html/ug/node889.htm"> Chapter 29: Displaying Graphics</a> (see Animating Graphics)
o <a href="https://www.sharcnet.ca/Software/Fluent12/html/ug/node1011.htm"> Chapter 33: Task Page Reference Guide</a> (see Graphics and Animations Task Page)
o <a href='https://www.sharcnet.ca/Software/Fluent13/help/flu_ug/flu_ug_chp_solve.html'> Chapter 28: Using the Solver</a> (see Animating the Solution)
o <a href='https://www.sharcnet.ca/Software/Fluent13/help/flu_ug/flu_ug_sec_graphics_animate.html'> Chapter 31: Displaying Graphics</a> (see Animating Graphics)
o <a href='https://www.sharcnet.ca/Software/Fluent13/help/flu_ug/flu_ug_rg_task_page.html'> Chapter 35: Task Page Reference Guide</a> (see Graphics and Animations Task Page)
If you have any questions about using Fluent on SHARCNET, please submit a ticket to the problem tracker by clicking <a href='https://www.sharcnet.ca/my/problems/submit'>here</a>, being sure to include the queue job numbers of failed jobs and the cluster you are referring to. Also, before submitting a ticket, ensure the below "Check Case" command does not report any problems with the mesh quality and so forth - an example of a warning message that should be cleaned up before submitting a ticket follows:
Solve --> Run Calculation --> Check Case
*** Recommendation: Maximum cell skewness is greater than 0.98, consider ***
*** improving the mesh quality before proceeding with your simulation.   ***