
Latest revision as of 10:33, 6 June 2019

This page is scheduled for deletion because it is either redundant with information available on the CC wiki, or the software is no longer supported.
Note: This page's content applies only to SHARCNET's legacy systems; it does not apply to Graham.


NCBI C++ TOOLKIT
Description: Provides free, portable, public domain libraries with no restrictions on use
SHARCNET Package information: see NCBI C++ TOOLKIT software page in web portal
Full list of SHARCNET supported software


Introduction

Provides the complete NCBI C++ Toolkit on all SHARCNET systems for command-line (non-graphical) use.

Version Selection

module unload intel    # no longer necessary; this is now done automatically
module load ncbic++toolkit/gcc630/18.0.0

Job Submission

Jobs may be submitted to the serial or threaded queue using standard sqsub commands. The example below submits the blastp binary to the threaded queue. Note that the sqsub core count (-n 4) must match the blastp thread count (-num_threads 4):

sqsub -r 1h -q threaded -n 4 --mpp=1G -o ofile.%J blastp -query test.txt \
   -db db/blast/sorted_env_nr -task blastp -out test.out  -evalue 0.001 -num_threads 4
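
Since -n and -num_threads must agree, one way to avoid a mismatch is to keep the thread count in a single shell variable. The following is a minimal sketch only: the names NTHREADS and CMD are illustrative and not part of sqsub or blastp, and the command is printed for inspection rather than submitted.

```shell
#!/bin/bash
# NTHREADS and CMD are illustrative names, not part of sqsub or blastp.
NTHREADS=4

# Build the submission line once so that -n and -num_threads always agree.
CMD="sqsub -r 1h -q threaded -n ${NTHREADS} --mpp=1G -o ofile.%J \
blastp -query test.txt -db db/blast/sorted_env_nr -task blastp \
-out test.out -evalue 0.001 -num_threads ${NTHREADS}"

# Print the command for inspection; to submit it, run: eval "$CMD"
echo "$CMD"
```

Changing NTHREADS in one place then updates both options together.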

Software Layout

Binary Programs

To list the 754 available binary programs, run the ls command in the bin directory:

[roberpj@orc-login2:/opt/sharcnet/ncbic++toolkit/18.0.0/gcc630/bin] ls

For example, to list only the blast binaries, run:

[roberpj@orc-login1:~] cd /opt/sharcnet/ncbic++toolkit/18.0.0/gcc630/bin

[roberpj@orc-login2:/opt/sharcnet/ncbic++toolkit/18.0.0/gcc630/bin] ls *blast*
blast_dataloader_unit_test  blastextend_unit_test   blastp
blastdb_aliastool           blastfilter_unit_test   blast_sample
blastdbcheck                blast_formatter         blast_services_unit_test
blastdbcmd                  blast_format_unit_test  blastsetup_unit_test
blastdbcp                   blasthits_unit_test     blastsrainput_unit_test
blastdb_format_unit_test    blastinput_demo         blast_tabular_unit_test
blast_demo                  blastinput_unit_test    blast_unit_test
blastdiag_unit_test         blastn                  blastx
blastengine_unit_test       blastoptions_unit_test
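
Many of the names in the listing above are unit tests rather than end-user programs. As a rough sketch, the listing can be filtered to hide them. The function name list_tools is hypothetical and not part of the toolkit, and the filter simply assumes that test programs end in _unit_test, as they do in the listing above.

```shell
#!/bin/bash
# list_tools is a hypothetical helper, not part of the toolkit.
# Usage: list_tools DIR PATTERN
# Prints names in DIR matching PATTERN, skipping *_unit_test programs.
list_tools() {
    local dir="$1" pattern="$2"
    # Run in a subshell so the caller's working directory is unchanged.
    # $pattern is left unquoted on purpose so the shell expands the glob.
    ( cd "$dir" && ls -d -- $pattern 2>/dev/null | grep -v '_unit_test$' )
}

BIN=/opt/sharcnet/ncbic++toolkit/18.0.0/gcc630/bin
if [ -d "$BIN" ]; then
    list_tools "$BIN" '*blast*'
fi
```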

Demo Programs

There are many demo programs. To get help with the blast demo, run:

[roberpj@orc-login2:/opt/sharcnet/ncbic++toolkit/18.0.0/gcc630/bin] ./blast_demo -help
USAGE
  blast_demo [-h] [-help] [-xmlhelp] -program ProgramName [-db DataBase]
    [-in Queryfile] [-out Outputfile] [-evalue evalue] [-penalty penalty]
    [-reward reward] [-matrix matrix] [-logfile File_Name]
    [-conffile File_Name] [-version] [-version-full] [-dryrun]

DESCRIPTION
   BLAST demo program

REQUIRED ARGUMENTS
 -program <String, `blastn', `blastp', `blastx', `dc-megablast', `megablast',
                   `rpsblast', `tblastn', `tblastx'>
   One of blastn, megablast, dc-megablast, blastp, blastx, tblastn, tblastx,
   rpsblast

OPTIONAL ARGUMENTS
 -h
   Print USAGE and DESCRIPTION;  ignore all other parameters
 -help
   Print USAGE, DESCRIPTION and ARGUMENTS; ignore all other parameters
 -xmlhelp
   Print USAGE, DESCRIPTION and ARGUMENTS in XML format; ignore all other
   parameters
 -db <String>
   This is the name of the database
   Default = `nr'
 -in <File_In>
   A file with the query
   Default = `stdin'
 -out <File_Out>
   The output file
   Default = `stdout'
 -evalue <Real>
   E-value threshold for saving hits
   Default = `0'
 -penalty <Integer>
   Penalty score for a mismatch
   Default = `0'
 -reward <Integer>
   Reward score for a match
   Default = `0'
 -matrix <String>
   Scoring matrix name
   Default = `BLOSUM62'
 -logfile <File_Out>
   File to which the program log should be redirected
 -conffile <File_In>
   Program's configuration (registry) data file
 -version
   Print version number;  ignore other arguments
 -version-full
   Print extended version data;  ignore other arguments
 -dryrun
   Dry run the application: do nothing, only test all preconditions

Example Job

The following steps use a blast binary to demonstrate how to submit a typical ncbic++toolkit binary program to the threaded queue:

Step 1) Prepare blast database:

ssh saw.sharcnet.ca
ssh saw-dev1
wget http://mirrors.vbi.vt.edu/mirrors/ftp.ncbi.nih.gov/blast/db/FASTA/env_nr.gz
gunzip env_nr.gz
mkdir -p db/blast/
makeblastdb -in env_nr -dbtype prot -out db/blast/sorted_env_nr -max_file_sz 500MB
ls db/blast/sorted_env_nr.*.*
db/blast/sorted_env_nr.00.phr  db/blast/sorted_env_nr.01.phr  db/blast/sorted_env_nr.02.phr
db/blast/sorted_env_nr.00.pin  db/blast/sorted_env_nr.01.pin  db/blast/sorted_env_nr.02.pin
db/blast/sorted_env_nr.00.psq  db/blast/sorted_env_nr.01.psq  db/blast/sorted_env_nr.02.psq
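
Before submitting a job, it may be worth confirming that every database volume has all three of the files shown in the listing above. The sketch below assumes the numbered multi-volume layout seen here (PREFIX.NN.phr, .pin, .psq); a database small enough to fit in a single volume may be named differently. The function name check_blastdb is hypothetical and not part of the toolkit.

```shell
#!/bin/bash
# check_blastdb is a hypothetical helper, not part of the toolkit.
# Usage: check_blastdb PREFIX
# Succeeds only if every volume PREFIX.NN has its .phr, .pin and .psq file.
check_blastdb() {
    local prefix="$1" phr base ext found=0
    for phr in "$prefix".*.phr; do
        [ -e "$phr" ] || continue      # the glob matched nothing
        found=1
        base="${phr%.phr}"
        for ext in pin psq; do
            if [ ! -e "$base.$ext" ]; then
                echo "missing: $base.$ext" >&2
                return 1
            fi
        done
    done
    # Fail if no volume was found at all.
    [ "$found" -eq 1 ]
}

if check_blastdb db/blast/sorted_env_nr; then
    echo "database volumes look complete"
else
    echo "database is missing or incomplete" >&2
fi
```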

Step 2) Create input file:

echo -e ">test\nGGG" > test.txt
cat test.txt
>test
GGG
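
As a variation on Step 2, the input file can be written with printf, which behaves consistently across shells, and then given a quick sanity check. This is a sketch only; the function name is_fasta is hypothetical, and it tests nothing beyond whether the first line begins with the FASTA header character.

```shell
#!/bin/bash
# Write a one-record FASTA file; '>' overwrites, so reruns do not append.
printf '>test\nGGG\n' > test.txt

# is_fasta is a hypothetical helper: true if the first line starts with '>'.
is_fasta() {
    local first
    IFS= read -r first < "$1" && [ "${first#>}" != "$first" ]
}

is_fasta test.txt && echo "test.txt starts with a FASTA header"
```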

Step 3) Perform a query:

sqsub -r 1h -q threaded -n 4 --mpp=1G -o ofile.%J blastp -query test.txt \
   -db db/blast/sorted_env_nr -task blastp -out test.out  -evalue 0.001 -num_threads 4

Step 4) Check results in output file test.out:

cat test.out

General Notes

Usage Help

To determine the arguments and options that a given binary program accepts, use the -help switch on the command line. The first part of the output for blastp is shown below as an example. No man pages are installed; however, because the complete toolkit has been installed, the online documentation at https://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/ provides similar information:

[roberpj@orc-login2:/opt/sharcnet/ncbic++toolkit/18.0.0/gcc630/bin] ./blastp -help
USAGE
  blastp [-h] [-help] [-import_search_strategy filename]
    [-export_search_strategy filename] [-task task_name] [-db database_name]
    [-dbsize num_letters] [-gilist filename] [-seqidlist filename]
    [-negative_gilist filename] [-entrez_query entrez_query]
    [-db_soft_mask filtering_algorithm] [-db_hard_mask filtering_algorithm]
    [-subject subject_input_file] [-subject_loc range] [-query input_file]
    [-out output_file] [-evalue evalue] [-word_size int_value]
    [-gapopen open_penalty] [-gapextend extend_penalty]
    [-qcov_hsp_perc float_value] [-max_hsps int_value]
    [-xdrop_ungap float_value] [-xdrop_gap float_value]
    [-xdrop_gap_final float_value] [-searchsp int_value]
    [-sum_stats bool_value] [-seg SEG_options] [-soft_masking soft_masking]
    [-matrix matrix_name] [-threshold float_value] [-culling_limit int_value]
    [-best_hit_overhang float_value] [-best_hit_score_edge float_value]
    [-window_size int_value] [-lcase_masking] [-query_loc range]
    [-parse_deflines] [-outfmt format] [-show_gis]
    [-num_descriptions int_value] [-num_alignments int_value]
    [-line_length line_length] [-html] [-max_target_seqs num_sequences]
    [-num_threads int_value] [-ungapped] [-remote] [-comp_based_stats compo]
    [-use_sw_tback] [-version]

DESCRIPTION
   Protein-Protein BLAST 2.5.1+

OPTIONAL ARGUMENTS
 -h
   Print USAGE and DESCRIPTION;  ignore all other parameters
 -help
   Print USAGE, DESCRIPTION and ARGUMENTS; ignore all other parameters
 -version
   Print version number;  ignore other arguments

 *** Input query options
 -query <File_In>
   Input file name
   Default = `-'
 -query_loc <String>
   Location on the query sequence in 1-based offsets (Format: start-stop)
[remaining output omitted]
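
Because the full -help output is long, it can be convenient to pull out the entry for a single option. The sketch below assumes the layout shown above, where each option starts a line with one leading space, and assumes the help text is written to standard output. The function name flag_help is hypothetical and not part of the toolkit.

```shell
#!/bin/bash
# flag_help is a hypothetical helper, not part of the toolkit.
# Usage: TOOL -help | flag_help FLAG
# Prints the line that introduces FLAG plus the two lines that follow it.
flag_help() {
    # Match " FLAG" followed by a space or end of line, so -h does not match -help.
    grep -E -A 2 -- "^ ${1}( |\$)"
}

# Only run against blastp if it is on the PATH (module loaded).
if command -v blastp >/dev/null 2>&1; then
    blastp -help | flag_help -num_threads
fi
```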

References

o NCBI C++ Toolkit Homepage
http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/

o Public Access to the Source Code via FTP
https://ncbi.github.io/cxx-toolkit/pages/ch_getcode_svn#ch_getcode_svn.ftp_download