Most of the Data Carpentry genomics lessons currently run on the Amazon cloud.
We do not yet know whether we will keep using the Amazon cloud.
FastQC provides a simple way to run quality control checks on raw sequence data coming from high-throughput sequencing pipelines. It provides a modular set of analyses that give a quick impression of whether your data has any problems you should be aware of before doing any further analysis.
FastQC is available for Linux, MacOS and Windows.
Trimmomatic is a Java-based program that can remove sequencer-specific reads and nucleotides whose quality falls below a given threshold. Trimmomatic can run multithreaded to finish quickly.
Trimmomatic is available for Linux, MacOS and Windows.
BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.
BWA is available for Linux and MacOS.
SAMtools is a suite of programs for interacting with high-throughput sequencing data. SAMtools can read, write, edit, index, and view files in the SAM, BAM, and CRAM formats.
SAMtools is available for Linux and MacOS.
BCFtools is a program for variant calling and manipulating files in the Variant Call Format (VCF) and its binary counterpart BCF.
BCFtools is available for Linux and MacOS.
IGV is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets.
IGV is available for Linux, MacOS and Windows.
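Once the tools are installed, a quick check like the one below can show which of them are reachable from the command line. This is only a sketch: the helper name `check_tools` is hypothetical, and the executable names (`fastqc`, `trimmomatic`, `bwa`, `samtools`, `bcftools`, `igv`) are assumptions that depend on how each tool was installed. Trimmomatic, for example, may exist only as a `.jar` file, and IGV is often started through a launcher script or desktop application instead.

```shell
# Hypothetical helper: report which of the named commands are on the PATH.
# Returns the number of missing commands (0 means everything was found).
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# Executable names below are assumptions; adjust them to match your install.
check_tools fastqc trimmomatic bwa samtools bcftools igv \
    || echo "Some tools are not on the PATH yet."
```

A tool reported as missing is not necessarily uninstalled; it may simply live outside the directories listed in `PATH`.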
You will also need to download a data tarball containing a reference genome and FASTQ files for E. coli, then extract it:
tar xzf variant_calling.tar.gz
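If you want to see what the archive contains before unpacking it, something along these lines may help. It is a minimal sketch: the function name `extract_tarball` is hypothetical, and it assumes the tarball has already been downloaded into the current directory.

```shell
# Hypothetical helper: list, then extract, a gzipped tarball if it exists.
extract_tarball() {
    archive="$1"
    if [ ! -f "$archive" ]; then
        echo "not found: $archive" >&2
        return 1
    fi
    tar tzf "$archive"   # t = list the contents without extracting
    tar xzf "$archive"   # x = extract into the current directory
}

# Assumes variant_calling.tar.gz is in the current directory.
extract_tarball variant_calling.tar.gz \
    || echo "Download the tarball before extracting it."
```

Listing first makes it easy to confirm that the download completed and to see which directory the files will be unpacked into.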