This project was carried out in the first round of SHARCNET's dedicated programming support competition. Its main goal was to develop a program that reads in the sequences of all genes present in a specified set of bacterial genomes and then groups related genes into clusters.
The program ran in several stages:
- download necessary genome files and prepare them
- submit BLAST comparisons of all genomes
- generate a listing of all significant alignment matches (hits) from BLAST output
- analyze hits based on some criteria for what constitutes an interesting link between genes
- build clusters (basically graphs) showing genetic links
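
The last two stages can be illustrated with a minimal sketch. It is written in Python for brevity and is not the project's code, which was in Perl. The actual linking criteria are not stated here, so the e-value cutoff, the `(query, subject, evalue)` row layout, and the function names `significant_hits` and `build_clusters` are all assumptions made for illustration. Clusters are taken to be the connected components of the graph whose edges are the retained hits.

```python
from collections import defaultdict


def significant_hits(rows, max_evalue=1e-5):
    """Yield (query, subject) pairs that pass an assumed e-value cutoff.

    The cutoff is a hypothetical stand-in for the project's real criteria.
    Self-hits (a gene matching itself) are skipped.
    """
    for query, subject, evalue in rows:
        if query != subject and evalue <= max_evalue:
            yield query, subject


def build_clusters(links):
    """Group genes into connected components using union-find."""
    parent = {}

    def find(gene):
        # Register unseen genes as their own root, then walk up to the root.
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving
            gene = parent[gene]
        return gene

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two components

    clusters = defaultdict(set)
    for gene in list(parent):
        clusters[find(gene)].add(gene)
    # Sort for a deterministic result.
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical hits: (query gene, subject gene, e-value)
hits = [
    ("geneA", "geneB", 1e-30),
    ("geneB", "geneC", 1e-12),
    ("geneD", "geneE", 1e-8),
    ("geneA", "geneD", 0.5),   # too weak, dropped by the cutoff
    ("geneF", "geneF", 0.0),   # self-hit, ignored
]

clusters = build_clusters(significant_hits(hits))
print(clusters)
```

With these made-up hits, `geneA`, `geneB` and `geneC` end up in one cluster and `geneD` and `geneE` in another. Genes with no retained hit, such as `geneF`, do not appear in any cluster in this sketch.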
The program was written as a series of Perl scripts.
Results obtained with the final workflow of the project were presented in this publication: Origin and evolution of gene families in Bacteria and Archaea.
For further information about the project, feel free to contact Hugh Merz.