Update November 2014: These instructions were written for the older SHARCNET environment. It should be possible to build MAFFT for the present SHARCNET environment without any deviation, although one should use kraken now instead of whale.

MAFFT is a multiple sequence alignment program for Unix-like operating systems. It offers a range of multiple alignment methods, including L-INS-i (accurate; for alignments of fewer than about 200 sequences) and FFT-NS-2 (fast; for alignments of fewer than about 10,000 sequences).

Documentation

Version 6 documentation can be found online on the MAFFT website. This page shows how to install MAFFT on SHARCNET and how to submit jobs.

Installing

Download the source archive from the MAFFT website:

ssh whale.sharcnet.ca
cd /work/$USER
wget http://mafft.cbrc.jp/alignment/software/mafft-6.717-with-extensions-src.tgz

Check the MAFFT website to see which version is the latest; this walkthrough used the most current version as of May 2010. Next, unpack the source archive:

tar -xvzf mafft-6.717-with-extensions-src.tgz
cd mafft-6.717-with-extensions
less readme
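Because the version number appears in the download URL, the archive name and the unpacked directory name, it can help to keep it in one place when following these steps with a newer release. The sketch below assumes that other releases follow the same naming pattern as 6.717, which may not hold; the function names are hypothetical helpers, not part of MAFFT.

```shell
#!/bin/bash
# Sketch: derive the archive name and URL from a single version string.
# Assumption: newer releases keep the 6.717 naming pattern used on this page.
MAFFT_VERSION=${MAFFT_VERSION:-6.717}

# Hypothetical helper: file name of the source archive for a given version.
mafft_archive_name() {
    echo "mafft-$1-with-extensions-src.tgz"
}

# Hypothetical helper: download URL for a given version.
mafft_source_url() {
    echo "http://mafft.cbrc.jp/alignment/software/$(mafft_archive_name "$1")"
}

# Example usage (not executed here):
#   wget "$(mafft_source_url "$MAFFT_VERSION")"
#   tar -xvzf "$(mafft_archive_name "$MAFFT_VERSION")"
#   cd "mafft-$MAFFT_VERSION-with-extensions"
```

If the download fails for a newer version, check the actual file name on the MAFFT website rather than relying on the pattern.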

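Edit core/Makefile and extensions/Makefile. Set PREFIX to point at your /home directory. Inside a Makefile the environment variable must be written as $(USER), because make reads $USER as $(U) followed by the literal text SER:

PREFIX = /home/$(USER)
#PREFIX = /usr/local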

And set CC to point at the SHARCNET compiler wrapper:

CC = cc
#CC = gcc
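The two Makefile edits can also be applied without opening an editor. The following is a minimal sketch, assuming each Makefile contains uncommented lines beginning with "PREFIX =" and "CC =" as shown above; set_makefile_vars is a hypothetical helper name. Because the shell expands the path before sed writes it, the Makefile ends up containing a literal directory name.

```shell
#!/bin/bash
# Sketch: rewrite the PREFIX and CC assignments in a Makefile in place.
# Assumption: the Makefile has uncommented "PREFIX = ..." and "CC = ..." lines.
set_makefile_vars() {
    local makefile=$1 prefix=$2 compiler=$3
    # -i.orig keeps a backup copy (Makefile.orig) of the unmodified file.
    # Commented-out lines (starting with "#") are left untouched.
    sed -i.orig \
        -e "s|^PREFIX *=.*|PREFIX = ${prefix}|" \
        -e "s|^CC *=.*|CC = ${compiler}|" \
        "$makefile"
}

# Example usage from the unpacked source directory (not executed here):
#   set_makefile_vars core/Makefile "/home/$USER" cc
#   set_makefile_vars extensions/Makefile "/home/$USER" cc
```

Compare each Makefile with its .orig backup afterwards to confirm that only the intended lines changed.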

Now ensure that the required directories exist in your /home folder (the -p option makes mkdir succeed even if a directory is already there):

mkdir -p /home/$USER/lib
mkdir -p /home/$USER/bin
mkdir -p /home/$USER/man

Now compile and install mafft:

cd core
make clean
make
make install
cd ../extensions
make clean
make
make install
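When the commands above are pasted in one go, a failed compile in core is easy to miss because the later steps still run. The sketch below wraps the same sequence so that it stops at the first failing step; build_and_install is a hypothetical helper name, and the directory layout (core and extensions under one source root) is taken from the steps above.

```shell
#!/bin/bash
# Sketch: build and install both MAFFT components, stopping on the first error.
build_and_install() {
    local src_root=$1 component
    for component in core extensions; do
        # The subshell keeps the caller's working directory unchanged,
        # and the && chain aborts the component at the first failing step.
        (
            cd "$src_root/$component" &&
            make clean &&
            make &&
            make install
        ) || { echo "build failed in $component" >&2; return 1; }
    done
}

# Example usage (not executed here):
#   build_and_install "/work/$USER/mafft-6.717-with-extensions"
```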

To use this installation of MAFFT, make sure that /home/$USER/bin is in your $PATH variable, e.g. by adding a line like the following to your ~/.bash_profile shell configuration file:

PATH=$PATH:/home/$USER/bin
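If the installation steps are repeated, the same PATH line can end up in the profile several times. A minimal sketch of an idempotent version follows; add_to_path_in_profile is a hypothetical helper, and the profile path is passed in so the function can be tried on a scratch file first.

```shell
#!/bin/bash
# Sketch: append a PATH line to a shell profile only if it is not already there.
add_to_path_in_profile() {
    local profile=$1 dir=$2
    # The backslash keeps $PATH literal so it is expanded at login, not now.
    local line="PATH=\$PATH:${dir}"
    # grep -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string.
    grep -qxF -- "$line" "$profile" 2>/dev/null || echo "$line" >> "$profile"
}

# Example usage (not executed here):
#   add_to_path_in_profile ~/.bash_profile "/home/$USER/bin"
```

The change takes effect at the next login, or immediately after running `source ~/.bash_profile`.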

Running MAFFT

MAFFT is serial, so you should run it on whale (or on kraken in the present SHARCNET environment, as noted in the update at the top of this page).

Now you can submit a job with:

 ssh whale.sharcnet.ca
 mkdir /work/$USER/mafft_run1
 cd /work/$USER/mafft_run1
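 sqsub -q serial -r 1d -o mafft_output mafft --auto input_sequences

Where input_sequences is a file of sequences in FASTA format residing in the /work/$USER/mafft_run1 directory. The -o option names the file that receives the job's output, so both the MAFFT and job scheduler output will be written to mafft_output. Do not add a shell redirection such as > mafft_output to the sqsub command line: the login shell would apply it to sqsub itself rather than to the MAFFT job.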
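Before a job is queued for up to a day, it can be worth confirming that the input really contains FASTA records and seeing which of the size ranges quoted at the top of this page it falls into. The sketch below counts header lines (lines starting with ">"); both function names are hypothetical, and the thresholds are only the approximate figures given above, so treat the suggestion as a rough guide. The --auto option used in the submission command selects a method on its own.

```shell
#!/bin/bash
# Sketch: count FASTA records and map the count to the size ranges quoted above.
count_fasta_records() {
    # Every FASTA record begins with a header line starting with ">".
    grep -c '^>' "$1"
}

suggest_method() {
    local n=$1
    if [ "$n" -lt 200 ]; then
        echo "L-INS-i"        # accurate; fewer than about 200 sequences
    elif [ "$n" -lt 10000 ]; then
        echo "FFT-NS-2"       # fast; fewer than about 10,000 sequences
    else
        echo "unknown"        # beyond the ranges quoted on this page
    fi
}

# Example usage (not executed here):
#   n=$(count_fasta_records input_sequences)
#   echo "$n sequences, size range suggests: $(suggest_method "$n")"
```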