
FASTA, FASTQ, and tabular text are basic, ubiquitous formats in bioinformatics. Common manipulations of these files include converting, searching, filtering, deduplicating, splitting, shuffling, and sampling. In this webinar, we will introduce bash commands for manipulating and querying such data. From counting sequences in different formats to searching for and modifying motifs, you will learn how to work with data files using only standard command-line tools such as grep, sed, and awk. We will also introduce seqkit, a program that makes bioinformatics pre-processing significantly easier and more efficient.
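
As a small taste of the kind of one-liners involved, the sketch below counts records in a FASTA and a FASTQ file and masks a motif in sequence lines. It is only an illustration under stated assumptions: the example records, the helper names (count_fasta, count_fastq, mask_motif), and the motif GGATCC are made up here, and the FASTQ count assumes strict four-line records.

```shell
# Create two tiny example files (made-up records, for illustration only).
tmpdir=$(mktemp -d)
printf '>seq1\nACGTACGT\n>seq2\nGGATCC\n' > "$tmpdir/example.fasta"
printf '@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\n@@II\n' > "$tmpdir/example.fastq"

# FASTA: every record starts with a header line beginning with '>'.
count_fasta() {
    grep -c '^>' "$1"
}

# FASTQ: assuming four lines per record, divide the line count by 4.
# Counting '^@' is unreliable because a quality line may also start with '@'.
count_fastq() {
    awk 'END { print NR / 4 }' "$1"
}

# Replace a motif with Ns on sequence lines only; header lines are skipped.
mask_motif() {
    sed '/^>/!s/GGATCC/NNNNNN/g' "$1"
}

count_fasta "$tmpdir/example.fasta"
count_fastq "$tmpdir/example.fastq"
mask_motif "$tmpdir/example.fasta"
```

The second example read has a quality string that begins with '@', which is why the FASTQ count is based on line numbers rather than on a header pattern.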