Revision as of 16:27, 20 August 2014 by Roberpj (Talk | contribs) (UNIGENE)

MPIBLAST
Description: Parallel implementation of NCBI BLAST
SHARCNET Package information: see MPIBLAST software page in web portal
Full list of SHARCNET supported software


Introduction

The mpiblast module must be manually loaded before submitting any mpiblast jobs. The two examples below demonstrate how to set up and submit jobs to the mpi queue.

Version Selection

All Clusters (except Guppy)

module unload openmpi intel
module load intel/12.1.3 openmpi/intel/1.6.2 mpiblast/1.6.0

Guppy Only

module unload openmpi intel
module load intel/11.0.083 openmpi/intel/1.4.2 mpiblast/1.6.0
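
The two module sets above differ only in the compiler and Open MPI versions. As a minimal sketch, the choice could be wrapped in a small helper that prints the commands for a given cluster name. The function name mpiblast_modules and its argument are illustrative only and are not part of any SHARCNET tool:

```shell
# mpiblast_modules CLUSTER: print (do not run) the module commands for CLUSTER.
# Hypothetical helper; the module names are copied from the lists above.
mpiblast_modules() {
    echo "module unload openmpi intel"
    case "$1" in
        guppy) echo "module load intel/11.0.083 openmpi/intel/1.4.2 mpiblast/1.6.0" ;;
        *)     echo "module load intel/12.1.3 openmpi/intel/1.6.2 mpiblast/1.6.0" ;;
    esac
}
```

On a login node the printed commands could then be applied with eval "$(mpiblast_modules guppy)", assuming the module command is available in the current shell.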

Sample Problems

DROSOPH

Copy the sample problem files (fasta database and input) from the /opt/sharcnet examples directory to a directory under work as shown here. The fasta database used in this example can be obtained as a guest from NCBI at http://www.ncbi.nlm.nih.gov/guide/all/#downloads_ by clicking "FTP: FASTA BLAST Databases".

mkdir -p /work/$USER/samples/mpiblast/test1
rm -f /work/$USER/samples/mpiblast/test1/*
cd /work/$USER/samples/mpiblast/test1
cp /opt/sharcnet/mpiblast/1.6.0/examples/drosoph.in drosoph.in
gunzip -c /opt/sharcnet/mpiblast/1.6.0/examples/drosoph.nt.gz > drosoph.nt

Create a hidden configuration file using a text editor (such as vi) to define a Shared storage location between nodes and a Local storage directory available on each compute node as follows:

[roberpj@hnd20:/work/$USER/samples/mpiblast/test1] vi .ncbirc
[NCBI]
Data=/opt/sharcnet/mpiblast/1.6.0/ncbi/data
[BLAST]
BLASTDB=/scratch/YourSharcnetUsername/mpiblasttest1
BLASTMAT=/work/YourSharcnetUsername/samples/mpiblast/test1
[mpiBLAST]
Shared=/scratch/YourSharcnetUsername/mpiblasttest1
Local=/tmp
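
Rather than typing the file by hand, the same layout can be generated from a here-document. The sketch below assumes the conventions of the listing above, namely that BLASTDB equals the Shared directory and BLASTMAT equals the working directory. The function name write_ncbirc is hypothetical:

```shell
# write_ncbirc DIR SHARED LOCAL: write DIR/.ncbirc with the sections shown above.
# Hypothetical helper; the Data path is the one used on this page.
write_ncbirc() {
    cat > "$1/.ncbirc" <<EOF
[NCBI]
Data=/opt/sharcnet/mpiblast/1.6.0/ncbi/data
[BLAST]
BLASTDB=$2
BLASTMAT=$1
[mpiBLAST]
Shared=$2
Local=$3
EOF
}
```

For this example it would be called as write_ncbirc /work/$USER/samples/mpiblast/test1 /scratch/$USER/mpiblasttest1 /tmp, which avoids having to replace YourSharcnetUsername manually.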

Partition the database into 8 fragments. Run the command from your work test1 directory; the fragments are written to the scratch directory given with -n:

module load mpiblast/1.6.0
mkdir -p /scratch/$USER/mpiblasttest1
rm -f /scratch/$USER/mpiblasttest1/*
cd /work/$USER/samples/mpiblast/test1

[roberpj@hnd20:/work/$USER/samples/mpiblast/test1] mpiformatdb -N 8 -i drosoph.nt -o T -p F -n /scratch/$USER/mpiblasttest1
Reading input file
Done, read 1534943 lines
Breaking drosoph.nt into 8 fragments
Executing: formatdb -p F -i drosoph.nt -N 8 -n /scratch/roberpj/mpiblasttest1/drosoph.nt -o T 
Created 8 fragments.
<<< Please make sure the formatted database fragments are placed in /scratch/roberpj/mpiblasttest1/ before executing mpiblast. >>>

[roberpj@hnd20:/scratch/roberpj/mpiblasttest1] ls
drosoph.nt.000.nhr  drosoph.nt.002.nin  drosoph.nt.004.nnd  drosoph.nt.006.nni
drosoph.nt.000.nin  drosoph.nt.002.nnd  drosoph.nt.004.nni  drosoph.nt.006.nsd
drosoph.nt.000.nnd  drosoph.nt.002.nni  drosoph.nt.004.nsd  drosoph.nt.006.nsi
drosoph.nt.000.nni  drosoph.nt.002.nsd  drosoph.nt.004.nsi  drosoph.nt.006.nsq
drosoph.nt.000.nsd  drosoph.nt.002.nsi  drosoph.nt.004.nsq  drosoph.nt.007.nhr
drosoph.nt.000.nsi  drosoph.nt.002.nsq  drosoph.nt.005.nhr  drosoph.nt.007.nin
drosoph.nt.000.nsq  drosoph.nt.003.nhr  drosoph.nt.005.nin  drosoph.nt.007.nnd
drosoph.nt.001.nhr  drosoph.nt.003.nin  drosoph.nt.005.nnd  drosoph.nt.007.nni
drosoph.nt.001.nin  drosoph.nt.003.nnd  drosoph.nt.005.nni  drosoph.nt.007.nsd
drosoph.nt.001.nnd  drosoph.nt.003.nni  drosoph.nt.005.nsd  drosoph.nt.007.nsi
drosoph.nt.001.nni  drosoph.nt.003.nsd  drosoph.nt.005.nsi  drosoph.nt.007.nsq
drosoph.nt.001.nsd  drosoph.nt.003.nsi  drosoph.nt.005.nsq  drosoph.nt.dbs
drosoph.nt.001.nsi  drosoph.nt.003.nsq  drosoph.nt.006.nhr  drosoph.nt.mbf
drosoph.nt.001.nsq  drosoph.nt.004.nhr  drosoph.nt.006.nin  drosoph.nt.nal
drosoph.nt.002.nhr  drosoph.nt.004.nin  drosoph.nt.006.nnd
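
Before submitting a job it can be worth confirming that the expected number of fragments was actually created. A minimal sketch that counts the .nsq files in the listing above follows. The function name count_fragments is hypothetical, and the .nsq suffix applies to nucleotide databases only (a protein database would be expected to use .psq instead):

```shell
# count_fragments DIR BASE: print how many BASE.NNN.nsq fragment files exist in DIR.
# Hypothetical helper; NNN is the three-digit fragment number seen in the listing.
count_fragments() {
    n=0
    for f in "$1"/"$2".[0-9][0-9][0-9].nsq; do
        # When nothing matches, the unexpanded pattern is skipped by the -e test.
        [ -e "$f" ] && n=$((n + 1))
    done
    echo "$n"
}
```

For the listing above, count_fragments /scratch/$USER/mpiblasttest1 drosoph.nt should print 8, matching the -N 8 passed to mpiformatdb.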

Submit a short job with a 15m time limit on 10 cores (8 plus 2). If all goes well, the results will be written to drosoph.out and the execution time will appear in ofile%J, where %J is the job number:

[roberpj@hnd20:/work/$USER/samples/mpiblast/test1]
sqsub -r 15m -n 10 -q mpi --mpp=1G -o ofile%J mpiblast -d drosoph.nt -i drosoph.in
    -p blastn -o drosoph.out --use-parallel-write --use-virtual-frags
submitted as jobid 6966896

[roberpj@hnd20:/work/roberpj/samples/mpiblast/test1] cat ofile6966896.hnd50 
Total Execution Time: 1.80031
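
Both sample jobs on this page request two more processes than there are database fragments (10 for 8 fragments here, 18 for 16 in the UNIGENE example), presumably because mpiblast reserves some ranks for coordination rather than searching. Assuming that pattern holds, the sqsub line can be derived from the fragment count. The helper name mpiblast_sqsub_cmd is hypothetical, and it only prints the command so it can be inspected before use:

```shell
# mpiblast_sqsub_cmd NFRAGS DB QUERY OUT: print (do not run) an sqsub command
# that requests NFRAGS + 2 processes, following the examples on this page.
mpiblast_sqsub_cmd() {
    nprocs=$(( $1 + 2 ))
    echo "sqsub -r 15m -n $nprocs -q mpi --mpp=1G -o ofile%J mpiblast -d $2 -i $3 -p blastn -o $4 --use-parallel-write --use-virtual-frags"
}
```

Running mpiblast_sqsub_cmd 8 drosoph.nt drosoph.in drosoph.out reproduces the command shown above with -n 10.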

When submitting an mpiblast job on a cluster such as goblin that doesn't have an InfiniBand interconnect, better performance (at least double the speed) will be achieved by running the mpi job on a single compute node. Regular users of non-contributed hardware should typically specify "-n 8" to reflect the maximum number of cores on a single node:

sqsub -r 15m -n 8 -N 1 -q mpi --mpp=4G -o ofile%J mpiblast -d drosoph.nt -i drosoph.in
           -p blastn -o drosoph.out --use-parallel-write --use-virtual-frags

Sample output results computed previously with BLASTN 2.2.15 [Oct-15-2006] are included in /opt/sharcnet/mpiblast/1.6.0/examples/ROSOPH.out for comparison with your newly generated drosoph.out file.
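
One simple way to make the comparison is to count the lines that differ between the reference file and the new output. This is only a sketch: the function name count_diff_lines is hypothetical, and a nonzero count is not necessarily an error, since header or timing lines may legitimately differ between runs:

```shell
# count_diff_lines REF NEW: print the number of lines that differ between two files.
# Hypothetical helper; counts the "<" and ">" lines in the diff output.
count_diff_lines() {
    diff "$1" "$2" | grep -c '^[<>]'
}
```

A result of 0 means the two files are identical. For anything else, running diff on the two files directly shows which lines changed.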

UNIGENE

The main purpose of this example is to illustrate some additional options and switches that may be useful for debugging and for dealing with larger databases, as described in official detail at http://www.mpiblast.org/Docs/Guide. The fasta database used in this example can also be downloaded from http://www.ncbi.nlm.nih.gov/guide/all/#downloads_ as a guest by clicking "FTP: UniGene" and then entering the "Homo_sapiens" sub-directory. More information about UniGene alignments can be found at https://cgwb.nci.nih.gov/cgi-bin/hgTrackUi?hgsid=95443&c=chr1&g=uniGene_3 . As with the DROSOPH example above, for convenience all required files can simply be copied from the /opt/sharcnet examples subdirectory to work as shown here:

mkdir /work/$USER/samples/mpiblast/test2
rm -f /work/$USER/samples/mpiblast/test2/*
cd /work/$USER/samples/mpiblast/test2
cp /opt/sharcnet/mpiblast/1.6.0/examples/il2ra.in il2ra.in
gunzip -c /opt/sharcnet/mpiblast/1.6.0/examples/Hs.seq.uniq.gz > Hs.seq.uniq

Create a hidden configuration file using a text editor (such as vi) to define a Shared storage location between nodes and a Local storage directory available on each compute node as follows. Note that the ncbi/data directory is not used in this example and hence can be omitted. If the Local and Shared directories are the same, replace --copy-via=mpi with --copy-via=none, as demonstrated in the sqsub commands below.

[username@orc-login1:/work/$USER/samples/mpiblast/test2] vi .ncbirc
[NCBI]
Data=/opt/sharcnet/mpiblast/1.6.0/ncbi/data
[BLAST]
BLASTDB=/work/YourSharcnetUsername/mpiblasttest2
BLASTMAT=/work/YourSharcnetUsername/samples/mpiblast/test2
[mpiBLAST]
Shared=/work/YourSharcnetUsername/mpiblasttest2
Local=/tmp
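
The rule about --copy-via can be checked mechanically by reading the Shared and Local entries back out of the file. The following is a minimal sketch under the assumption that each key appears once, at the start of a line, in KEY=value form as in the listing above. Both function names are hypothetical:

```shell
# ncbirc_value FILE KEY: print the value of the first KEY=... line in FILE.
ncbirc_value() {
    sed -n "s/^$2=//p" "$1" | head -n 1
}

# copy_via_option FILE: print the --copy-via flag suggested by the rule above,
# i.e. "none" when Shared and Local are the same directory, otherwise "mpi".
copy_via_option() {
    shared="$(ncbirc_value "$1" Shared)"
    loc="$(ncbirc_value "$1" Local)"
    if [ "$shared" = "$loc" ]; then
        echo "--copy-via=none"
    else
        echo "--copy-via=mpi"
    fi
}
```

With the file above, copy_via_option .ncbirc prints --copy-via=mpi because Shared is under /work while Local is /tmp.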

Partition the database into 16 fragments. Run the command from your work test2 directory; the fragments are written to the directory given with -n:

module load mpiblast/1.6.0
mkdir -p /work/$USER/mpiblasttest2
rm -f /work/$USER/mpiblasttest2/*
cd /work/$USER/samples/mpiblast/test2

[roberpj@orc-login1:/work/roberpj/samples/mpiblast/test2] mpiformatdb -N 16 -i Hs.seq.uniq -o T -p F -n /work/roberpj/mpiblasttest2
Reading input file
Done, read 2348651 lines
Breaking Hs.seq.uniq into 16 fragments
Executing: formatdb -p F -i Hs.seq.uniq -N 16 -o T 
Created 16 fragments.
<<< Please make sure the formatted database fragments are placed in /work/roberpj/mpiblasttest2/ before executing mpiblast. >>>

 [roberpj@hnd19:/work/roberpj/mpiblasttest2] ls
Hs.seq.uniq.000.nhr  Hs.seq.uniq.003.nnd  Hs.seq.uniq.006.nsd  Hs.seq.uniq.009.nsq  Hs.seq.uniq.013.nin
Hs.seq.uniq.000.nin  Hs.seq.uniq.003.nni  Hs.seq.uniq.006.nsi  Hs.seq.uniq.010.nhr  Hs.seq.uniq.013.nnd
Hs.seq.uniq.000.nnd  Hs.seq.uniq.003.nsd  Hs.seq.uniq.006.nsq  Hs.seq.uniq.010.nin  Hs.seq.uniq.013.nni
Hs.seq.uniq.000.nni  Hs.seq.uniq.003.nsi  Hs.seq.uniq.007.nhr  Hs.seq.uniq.010.nnd  Hs.seq.uniq.013.nsd
Hs.seq.uniq.000.nsd  Hs.seq.uniq.003.nsq  Hs.seq.uniq.007.nin  Hs.seq.uniq.010.nni  Hs.seq.uniq.013.nsi
Hs.seq.uniq.000.nsi  Hs.seq.uniq.004.nhr  Hs.seq.uniq.007.nnd  Hs.seq.uniq.010.nsd  Hs.seq.uniq.013.nsq
Hs.seq.uniq.000.nsq  Hs.seq.uniq.004.nin  Hs.seq.uniq.007.nni  Hs.seq.uniq.010.nsi  Hs.seq.uniq.014.nhr
Hs.seq.uniq.001.nhr  Hs.seq.uniq.004.nnd  Hs.seq.uniq.007.nsd  Hs.seq.uniq.010.nsq  Hs.seq.uniq.014.nin
Hs.seq.uniq.001.nin  Hs.seq.uniq.004.nni  Hs.seq.uniq.007.nsi  Hs.seq.uniq.011.nhr  Hs.seq.uniq.014.nnd
Hs.seq.uniq.001.nnd  Hs.seq.uniq.004.nsd  Hs.seq.uniq.007.nsq  Hs.seq.uniq.011.nin  Hs.seq.uniq.014.nni
Hs.seq.uniq.001.nni  Hs.seq.uniq.004.nsi  Hs.seq.uniq.008.nhr  Hs.seq.uniq.011.nnd  Hs.seq.uniq.014.nsd
Hs.seq.uniq.001.nsd  Hs.seq.uniq.004.nsq  Hs.seq.uniq.008.nin  Hs.seq.uniq.011.nni  Hs.seq.uniq.014.nsi
Hs.seq.uniq.001.nsi  Hs.seq.uniq.005.nhr  Hs.seq.uniq.008.nnd  Hs.seq.uniq.011.nsd  Hs.seq.uniq.014.nsq
Hs.seq.uniq.001.nsq  Hs.seq.uniq.005.nin  Hs.seq.uniq.008.nni  Hs.seq.uniq.011.nsi  Hs.seq.uniq.015.nhr
Hs.seq.uniq.002.nhr  Hs.seq.uniq.005.nnd  Hs.seq.uniq.008.nsd  Hs.seq.uniq.011.nsq  Hs.seq.uniq.015.nin
Hs.seq.uniq.002.nin  Hs.seq.uniq.005.nni  Hs.seq.uniq.008.nsi  Hs.seq.uniq.012.nhr  Hs.seq.uniq.015.nnd
Hs.seq.uniq.002.nnd  Hs.seq.uniq.005.nsd  Hs.seq.uniq.008.nsq  Hs.seq.uniq.012.nin  Hs.seq.uniq.015.nni
Hs.seq.uniq.002.nni  Hs.seq.uniq.005.nsi  Hs.seq.uniq.009.nhr  Hs.seq.uniq.012.nnd  Hs.seq.uniq.015.nsd
Hs.seq.uniq.002.nsd  Hs.seq.uniq.005.nsq  Hs.seq.uniq.009.nin  Hs.seq.uniq.012.nni  Hs.seq.uniq.015.nsi
Hs.seq.uniq.002.nsi  Hs.seq.uniq.006.nhr  Hs.seq.uniq.009.nnd  Hs.seq.uniq.012.nsd  Hs.seq.uniq.015.nsq
Hs.seq.uniq.002.nsq  Hs.seq.uniq.006.nin  Hs.seq.uniq.009.nni  Hs.seq.uniq.012.nsi  Hs.seq.uniq.dbs
Hs.seq.uniq.003.nhr  Hs.seq.uniq.006.nnd  Hs.seq.uniq.009.nsd  Hs.seq.uniq.012.nsq  Hs.seq.uniq.mbf
Hs.seq.uniq.003.nin  Hs.seq.uniq.006.nni  Hs.seq.uniq.009.nsi  Hs.seq.uniq.013.nhr  Hs.seq.uniq.nal                   

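Submit a couple of short jobs with a 15m time limit. If all goes well, the results will be written to biobrew.out and the execution time will appear in the corresponding ofile%J, where %J is the job number as usual. Note that both commands below write to biobrew.out, so the second job overwrites the results of the first unless a different -o name is given: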

A) Usage of the time profile option (--time-profile) is shown:

[roberpj@hnd19:/work/roberpj/samples/mpiblast/test2] rm -f oTime*;   
    sqsub -r 15m -n 18 -q mpi -o ofile%J mpiblast  --use-parallel-write --copy-via=mpi
       -d Hs.seq.uniq -i il2ra.in -p blastn -o biobrew.out --time-profile=oTime

B) Usage of the debug option is also shown:

[roberpj@hnd19:/work/roberpj/samples/mpiblast/test2] rm -f oLog*;
    sqsub -r 15m -n 18 -q mpi -o ofile%J mpiblast --use-parallel-write --copy-via=none
        -d Hs.seq.uniq -i il2ra.in -p blastn -o biobrew.out --debug=oLog

Finally compare /opt/sharcnet/mpiblast/1.6.0/examples/BIOBREW.out computed previously with BLASTN 2.2.15 [Oct-15-2006] with your newly generated biobrew.out output file to verify the results and submit a ticket if there are any problems!

SUPPORTED PROGRAMS IN MPIBLAST

As described in http://www.mpiblast.org/Docs/FAQ mpiblast supports the standard blast programs http://www.ncbi.nlm.nih.gov/BLAST/blast_program.shtml which are reproduced here for reference:

blastp:     Compares an amino acid query sequence against a protein sequence database.
blastn:     Compares a nucleotide query sequence against a nucleotide sequence database.
blastx:     Compares a nucleotide query sequence translated in all reading frames against
            a protein sequence database.
tblastn:    Compares a protein query sequence against a nucleotide sequence database
            dynamically translated in all reading frames.
tblastx:    Compares the six-frame translations of a nucleotide query sequence against
            the six-frame translations of a nucleotide sequence database.
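
Each program above searches either a protein or a nucleotide database, which determines the -p value to give mpiformatdb (T for protein, F for nucleotide, as listed in the formatdb options). A minimal sketch of that mapping follows. The function name formatdb_type is hypothetical:

```shell
# formatdb_type PROGRAM: print the mpiformatdb -p value for the database that
# PROGRAM searches (T = protein, F = nucleotide), per the descriptions above.
formatdb_type() {
    case "$1" in
        blastp|blastx)          echo T ;;
        blastn|tblastn|tblastx) echo F ;;
        *) echo "unknown program: $1" >&2; return 1 ;;
    esac
}
```

For example, formatdb_type blastn prints F, which matches the "-p F" used with mpiformatdb in both sample problems.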

MPIBLAST BINARY OPTIONS

[roberpj@orc-login1:/opt/sharcnet/mpiblast/1.6.0/bin] ./mpiblast -help
mpiBLAST requires the following options: -d [database] -i [query file] -p [blast program name]
[roberpj@orc-login1:/opt/sharcnet/mpiblast/1.6.0/bin] ./mpiformatdb --help
Executing: formatdb - 

formatdb 2.2.20   arguments:

  -t  Title for database file [String]  Optional
  -i  Input file(s) for formatting [File In]  Optional
  -l  Logfile name: [File Out]  Optional
    default = formatdb.log
  -p  Type of file
         T - protein   
         F - nucleotide [T/F]  Optional
    default = T
  -o  Parse options
         T - True: Parse SeqId and create indexes.
         F - False: Do not parse SeqId. Do not create indexes.
 [T/F]  Optional
    default = F
  -a  Input file is database in ASN.1 format (otherwise FASTA is expected)
         T - True, 
         F - False.
 [T/F]  Optional
    default = F
  -b  ASN.1 database in binary mode
         T - binary, 
         F - text mode.
 [T/F]  Optional
    default = F
  -e  Input is a Seq-entry [T/F]  Optional
    default = F
  -n  Base name for BLAST files [String]  Optional
  -v  Database volume size in millions of letters [Integer]  Optional
    default = 4000
  -s  Create indexes limited only to accessions - sparse [T/F]  Optional
    default = F
  -V  Verbose: check for non-unique string ids in the database [T/F]  Optional
    default = F
  -L  Create an alias file with this name
        use the gifile arg (below) if set to calculate db size
        use the BLAST db specified with -i (above) [File Out]  Optional
  -F  Gifile (file containing list of gi's) [File In]  Optional
  -B  Binary Gifile produced from the Gifile specified above [File Out]  Optional
  -T  Taxid file to set the taxonomy ids in ASN.1 deflines [File In]  Optional
  -N  Number of database volumes [Integer]  Optional
    default = 0

General Notes

Issue: Job runs for a while then dies.

The solution here is to filter the input sequence file. For reasons not yet understood, the presence of repeat sections results in many thousands of WARNING and ERROR messages being rapidly written to the "sqsub -o ofile" output file, presumably as mpiblast ignores sequences, before the job eventually dies after several hours or possibly days.

# cat  ofile1635556.saw-admin.saw.sharcnet | grep "WARNING\|ERROR" | wc -l
10560
# cat ofile1635556.saw-admin.saw.sharcnet
Selenocysteine (U) at position 60 replaced by X
Selenocysteine (U) at position 42 replaced by X
[blastall] WARNING:  [000.000]  NODE_84_length_162_cov_46.259258_1_192_-: SetUpBlastSearch failed.
[blastall] ERROR:  [000.000]  NODE_84_length_162_cov_46.259258_1_192_-: BLASTSetUpSearch: Unable to calculate Karlin-Altschul params, check query sequence
<<<< snipped out ~10000 similar WARNING and ERROR messages from this example >>>>
[blastall] WARNING:  [000.000]  NODE_65409_length_87_cov_2.367816_1_77_+: SetUpBlastSearch failed.
[blastall] ERROR:  [000.000]  NODE_65409_length_87_cov_2.367816_1_77_+: BLASTSetUpSearch: Unable to calculate Karlin-Altschul params, check query sequence
Selenocysteine (U) at position 61 replaced by X
Selenocysteine (U) at position 62 replaced by X
Selenocysteine (U) at position 34 replaced by X
Selenocysteine (U) at position 1058 replaced by X
--------------------------------------------------------------------------
mpirun noticed that process rank 52 with PID 30067 on node saw214 exited on signal 9 (Killed).
--------------------------------------------------------------------------
1618    17      651     62909.3 Bailing out with signal 15
14      3       62909.3 Bailing out with signal 15
19      15      62909.3 Bailing out with signal 1536    5       7       62909.312       62909.310       62909.3 Bailing out with signal 1547    21      62909.3 Bailing out with signal 159  62909.3 Bailing out with signal 15
45      62909.3 Bailing out with signal 1525    62909.3 Bailing out with signal 158     62909.3 Bailing out with signal 15
50      62909.3 Bailing out with signal 1522    62909.3 Bailing out with signal 15
11      62909.3 Bailing out with signal 15
48      62909.323       62909.3 Bailing out with signal 15
        Bailing out with signal 15
46      62909.3 Bailing out with signal 15
20      62909.3 Bailing out with signal 15
        Bailing out with signal 15
51      62909.3 Bailing out with signal 15
26      62909.3 Bailing out with signal 15

28      96
40      27      108
114     38      62909.3 Bailing out with signal 1530    102     4431    39      62909.3 Bailing out with signal 15107   43      34      85      103     5932    88  109      105     62909.3 Bailing out with signal 1586    104     62909.3 Bailing out with signal 1129062909.3    Bailing out with signal 15120   117     62118   62909.3      Bailing out with signal 15116   62909.3 Bailing out with signal 15
0       62909.3 Bailing out with signal 15
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD 
with errorcode 0.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
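
As one possible starting point for the filtering, the query names in the [blastall] ERROR lines can be collected and the matching records dropped from the query file. This is a sketch only, not a tested remedy: it assumes the name before the colon in each ERROR line equals the first word of a FASTA header in the query file, and both function names are hypothetical:

```shell
# failed_queries LOG: print the unique query IDs named in [blastall] ERROR lines.
failed_queries() {
    awk '$1 == "[blastall]" && $2 == "ERROR:" { id = $4; sub(/:$/, "", id); print id }' "$1" | sort -u
}

# drop_queries IDS FASTA: print FASTA without the records whose ID is listed in IDS.
# The ID of a record is taken to be the first word after ">" on its header line.
drop_queries() {
    awk 'FILENAME == ARGV[1] { bad[$1] = 1; next }
         /^>/                { skip = (substr($1, 2) in bad) }
         !skip' "$1" "$2"
}
```

A typical sequence would be failed_queries ofile1635556.saw-admin.saw.sharcnet > bad.ids followed by drop_queries bad.ids query.fa > query.filtered.fa, where bad.ids, query.fa and query.filtered.fa are placeholder file names.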

References

o MPIBLAST Homepage
http://www.mpiblast.org/

o MPIBLAST Version History
http://www.mpiblast.org/Downloads/Version-History