Latest revision as of 09:33, 6 June 2019
This page is scheduled for deletion because it is either redundant with information available on the CC wiki, or the software is no longer supported.
Note: This page's content applies only to SHARCNET's legacy systems; it does not apply to Graham.
NCBI C++ TOOLKIT
Description: Provides free, portable, public domain libraries with no restrictions on use
SHARCNET Package information: see NCBI C++ TOOLKIT software page in web portal
Full list of SHARCNET supported software
Introduction
Provides the complete NCBI C++ TOOLKIT on all SHARCNET systems for command-line (non-graphical) use.
Version Selection
module unload intel    <--- No longer necessary, as it is now done automatically!
module load ncbic++toolkit/gcc630/18.0.0
Job Submission
Jobs may be submitted to the serial or threaded queue using standard sqsub commands. The example below uses the blastp binary program; note that the -n 4 passed to sqsub must match the -num_threads 4 passed to blastp:
sqsub -r 1h -q threaded -n 4 --mpp=1G -o ofile.%J blastp -query test.txt -db db/blast/sorted_env_nr -task blastp -out test.out -evalue 0.001 -num_threads 4
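Because -n and -num_threads must agree, one way to avoid a mismatch is to set the thread count once and build the command line from it. The following is a minimal sketch only: the function name build_blastp_cmd and the NTHREADS variable are illustrative and are not part of the SHARCNET tools. It prints the command rather than running it, so the result can be inspected first.

```shell
#!/bin/bash
# Hypothetical helper (not a SHARCNET tool): builds the sqsub command line
# from a single thread count so that -n and -num_threads cannot drift apart.
build_blastp_cmd() {
    local nthreads="$1"
    echo "sqsub -r 1h -q threaded -n ${nthreads} --mpp=1G -o ofile.%J" \
         "blastp -query test.txt -db db/blast/sorted_env_nr -task blastp" \
         "-out test.out -evalue 0.001 -num_threads ${nthreads}"
}

# Illustrative variable: set the thread count in one place only.
NTHREADS=4
build_blastp_cmd "$NTHREADS"
```

Once the printed command looks right, it can be pasted at the prompt to submit the job.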
Software Layout
Binary Programs
To list the 754 available binary programs, run the ls command:
[roberpj@orc-login2:/opt/sharcnet/ncbic++toolkit/18.0.0/gcc630/bin] ls
For example, to see a listing of ONLY the blast binaries run:
[roberpj@orc-login1:~] cd /opt/sharcnet/ncbic++toolkit/18.0.0/gcc630/bin
[roberpj@orc-login2:/opt/sharcnet/ncbic++toolkit/18.0.0/gcc630/bin] ls *blast*
blast_dataloader_unit_test  blastextend_unit_test   blastp
blastdb_aliastool           blastfilter_unit_test   blast_sample
blastdbcheck                blast_formatter         blast_services_unit_test
blastdbcmd                  blast_format_unit_test  blastsetup_unit_test
blastdbcp                   blasthits_unit_test     blastsrainput_unit_test
blastdb_format_unit_test    blastinput_demo         blast_tabular_unit_test
blast_demo                  blastinput_unit_test    blast_unit_test
blastdiag_unit_test         blastn                  blastx
blastengine_unit_test       blastoptions_unit_test
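Most of the entries in the listing above are unit tests. As a rough sketch, the listing can be narrowed to the remaining programs by filtering out names ending in _unit_test. The function name list_blast_tools and the BINDIR variable are illustrative assumptions; the directory is the one shown in the prompt above.

```shell
#!/bin/bash
# Hypothetical helper: list blast-related programs in a directory,
# skipping anything whose name ends in _unit_test.
list_blast_tools() {
    local bindir="$1"
    ls "$bindir" | grep blast | grep -v '_unit_test$'
}

# Directory taken from the listing above; the call is skipped if it is absent.
BINDIR='/opt/sharcnet/ncbic++toolkit/18.0.0/gcc630/bin'
if [ -d "$BINDIR" ]; then
    list_blast_tools "$BINDIR"
fi
```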
Demo Programs
There are many demo programs. To get help with the blast demo, run the command:
[roberpj@orc-login2:/opt/sharcnet/ncbic++toolkit/18.0.0/gcc630/bin] ./blast_demo -help
USAGE
  blast_demo [-h] [-help] [-xmlhelp] -program ProgramName [-db DataBase]
    [-in Queryfile] [-out Outputfile] [-evalue evalue] [-penalty penalty]
    [-reward reward] [-matrix matrix] [-logfile File_Name]
    [-conffile File_Name] [-version] [-version-full] [-dryrun]
DESCRIPTION
   BLAST demo program
REQUIRED ARGUMENTS
 -program <String, `blastn', `blastp', `blastx', `dc-megablast', `megablast', `rpsblast', `tblastn', `tblastx'>
   One of blastn, megablast, dc-megablast, blastp, blastx, tblastn, tblastx, rpsblast
OPTIONAL ARGUMENTS
 -h
   Print USAGE and DESCRIPTION; ignore all other parameters
 -help
   Print USAGE, DESCRIPTION and ARGUMENTS; ignore all other parameters
 -xmlhelp
   Print USAGE, DESCRIPTION and ARGUMENTS in XML format; ignore all other parameters
 -db <String>
   This is the name of the database
   Default = `nr'
 -in <File_In>
   A file with the query
   Default = `stdin'
 -out <File_Out>
   The output file
   Default = `stdout'
 -evalue <Real>
   E-value threshold for saving hits
   Default = `0'
 -penalty <Integer>
   Penalty score for a mismatch
   Default = `0'
 -reward <Integer>
   Reward score for a match
   Default = `0'
 -matrix <String>
   Scoring matrix name
   Default = `BLOSUM62'
 -logfile <File_Out>
   File to which the program log should be redirected
 -conffile <File_In>
   Program's configuration (registry) data file
 -version
   Print version number; ignore other arguments
 -version-full
   Print extended version data; ignore other arguments
 -dryrun
   Dry run the application: do nothing, only test all preconditions
Example Job
This example uses a blast binary to demonstrate how to submit a typical ncbic++toolkit binary program to the threaded queue:
Step 1) Prepare blast database:
ssh saw.sharcnet.ca
ssh saw-dev1
wget http://mirrors.vbi.vt.edu/mirrors/ftp.ncbi.nih.gov/blast/db/FASTA/env_nr.gz
gunzip env_nr.gz
mkdir -p db/blast/
makeblastdb -in env_nr -dbtype prot -out db/blast/sorted_env_nr -max_file_sz 500MB
ls db/blast/sorted_env_nr.*.*
db/blast/sorted_env_nr.00.phr  db/blast/sorted_env_nr.01.phr  db/blast/sorted_env_nr.02.phr
db/blast/sorted_env_nr.00.pin  db/blast/sorted_env_nr.01.pin  db/blast/sorted_env_nr.02.pin
db/blast/sorted_env_nr.00.psq  db/blast/sorted_env_nr.01.psq  db/blast/sorted_env_nr.02.psq
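Before submitting a job, it may be worth confirming that makeblastdb finished and left a complete set of files. The sketch below assumes the multi-volume layout shown in the listing above (one .phr, .pin and .psq file per numbered volume); the function name check_blast_db is illustrative and is not part of the toolkit.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if at least one numbered volume exists
# and every <prefix>.NN.phr file has matching .pin and .psq files.
check_blast_db() {
    local prefix="$1"
    local phr base found=0
    for phr in "$prefix".*.phr; do
        [ -e "$phr" ] || continue      # glob matched nothing
        found=1
        base="${phr%.phr}"
        if [ ! -e "$base.pin" ] || [ ! -e "$base.psq" ]; then
            return 1                   # a volume is incomplete
        fi
    done
    [ "$found" -eq 1 ]
}

if check_blast_db db/blast/sorted_env_nr; then
    echo "database volumes look complete"
else
    echo "database volumes missing or incomplete"
fi
```

This checks only that the files exist, not that their contents are valid.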
Step 2) Create input file:
echo -e ">test\nGGG" >> test.txt
cat test.txt
>test
GGG
Step 3) Perform a query:
sqsub -r 1h -q threaded -n 4 --mpp=1G -o ofile.%J blastp -query test.txt -db db/blast/sorted_env_nr -task blastp -out test.out -evalue 0.001 -num_threads 4
Step 4) Check results in output file test.out:
cat test.out
General Notes
Usage Help
To determine the arguments and options a given binary program accepts, use the -help switch on the command line. The initial portion of the blastp output is shown below as an example. There are no man pages; however, because the complete toolkit has been installed, similar information can be found in the online documentation at https://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/:
[roberpj@orc-login2:/opt/sharcnet/ncbic++toolkit/18.0.0/gcc630/bin] ./blastp -help
USAGE
  blastp [-h] [-help] [-import_search_strategy filename]
    [-export_search_strategy filename] [-task task_name] [-db database_name]
    [-dbsize num_letters] [-gilist filename] [-seqidlist filename]
    [-negative_gilist filename] [-entrez_query entrez_query]
    [-db_soft_mask filtering_algorithm] [-db_hard_mask filtering_algorithm]
    [-subject subject_input_file] [-subject_loc range] [-query input_file]
    [-out output_file] [-evalue evalue] [-word_size int_value]
    [-gapopen open_penalty] [-gapextend extend_penalty]
    [-qcov_hsp_perc float_value] [-max_hsps int_value]
    [-xdrop_ungap float_value] [-xdrop_gap float_value]
    [-xdrop_gap_final float_value] [-searchsp int_value]
    [-sum_stats bool_value] [-seg SEG_options] [-soft_masking soft_masking]
    [-matrix matrix_name] [-threshold float_value] [-culling_limit int_value]
    [-best_hit_overhang float_value] [-best_hit_score_edge float_value]
    [-window_size int_value] [-lcase_masking] [-query_loc range]
    [-parse_deflines] [-outfmt format] [-show_gis]
    [-num_descriptions int_value] [-num_alignments int_value]
    [-line_length line_length] [-html] [-max_target_seqs num_sequences]
    [-num_threads int_value] [-ungapped] [-remote] [-comp_based_stats compo]
    [-use_sw_tback] [-version]
DESCRIPTION
   Protein-Protein BLAST 2.5.1+
OPTIONAL ARGUMENTS
 -h
   Print USAGE and DESCRIPTION; ignore all other parameters
 -help
   Print USAGE, DESCRIPTION and ARGUMENTS; ignore all other parameters
 -version
   Print version number; ignore other arguments
 *** Input query options
 -query <File_In>
   Input file name
   Default = `-'
 -query_loc <String>
   Location on the query sequence in 1-based offsets (Format: start-stop)
etc etc
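Since the -help output is long, it can be convenient to filter it for a single option. The following is a minimal sketch: the function name show_option_help is illustrative, and it assumes a grep that supports -A (GNU and BSD grep both do). It prints every help line that mentions the option, plus the two lines that follow each match.

```shell
#!/bin/bash
# Hypothetical helper: run "<program> -help" and show only the lines that
# mention the given option, with two lines of trailing context.
show_option_help() {
    local program="$1" option="$2"
    # -e lets the pattern start with a dash without being read as a grep flag.
    "$program" -help 2>&1 | grep -A 2 -e "$option"
}

# Runs only if blastp is on the PATH (e.g. after loading the module).
if command -v blastp >/dev/null 2>&1; then
    show_option_help blastp -num_threads
fi
```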
References
o NCBI C++ Toolkit Homepage
http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/
o Public Access to the Source Code via FTP
https://ncbi.github.io/cxx-toolkit/pages/ch_getcode_svn#ch_getcode_svn.ftp_download