Appendix V. Software and Tools

0) Processing files in different formats

0.1) sequence
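
FASTA is one of the most common sequence formats. As a minimal sketch of how such a file can be read, the Python below parses FASTA text into a dictionary and reports the GC content of each record. The function names `read_fasta` and `gc_content` and the demo records are illustrative assumptions, not part of any tool listed in this appendix.

```python
from typing import Dict, Iterable, List, Optional


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {record_id: sequence}."""
    records: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            tokens = line[1:].split()
            current = tokens[0] if tokens else ""
            records[current] = []
        elif current is not None:
            # Sequence lines may wrap; collect them and join at the end.
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


# Hypothetical demo input; in practice pass an open file handle instead.
demo_lines = [">seq1 demo record", "ACGT", "GGCC", ">seq2", "ATAT"]
records = read_fasta(demo_lines)
for name, seq in records.items():
    print(name, len(seq), gc_content(seq))
```

Because `read_fasta` accepts any iterable of lines, the same function works on an open file, for example `read_fasta(open(path))` for a path of your choosing.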

0.2) alignment

0.3) interval
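
Interval data are often stored in BED format, where the first three tab-separated columns are chromosome, start and end, and coordinates are 0-based and half-open (the end position is excluded). The sketch below, with the hypothetical names `Interval`, `parse_bed_line` and `overlap_length`, shows how that convention makes overlap arithmetic simple.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def parse_bed_line(line: str) -> Interval:
    """Read the first three columns of a tab-separated BED line."""
    fields = line.rstrip("\n").split("\t")
    return Interval(fields[0], int(fields[1]), int(fields[2]))


def overlap_length(a: Interval, b: Interval) -> int:
    """Number of bases shared by two intervals (0 if they do not overlap)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


# Hypothetical demo intervals.
a = parse_bed_line("chr1\t100\t200\tpeakA")
b = parse_bed_line("chr1\t150\t300\tpeakB")
c = parse_bed_line("chr1\t200\t250\tpeakC")
print(overlap_length(a, b))  # 50
print(overlap_length(a, c))  # 0: adjacent half-open intervals do not overlap
```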

1) Homolog analysis

  • blast: a convenient web-based tool

  • blat: a BLAST-like tool

  • mmseqs: a more modern homology search tool than blast; recommended for large-scale local computation

  • diamond: a homology search tool for proteins

  • hmmer: profile HMM-based search for protein and nucleotide sequences

  • infernal: profile SCFG-based search for structured noncoding RNA

  • hh-suite: profile HMM to profile HMM alignment
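
Several of the search tools above can write hits in the 12-column BLAST tabular layout (`-outfmt 6` in BLAST+: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). As a rough sketch of a typical post-processing step, the Python below keeps the highest-scoring hit per query. The names `Hit` and `best_hits`, the E-value cutoff and the demo rows are assumptions made for illustration.

```python
import csv
import io
from typing import Dict, NamedTuple, TextIO


class Hit(NamedTuple):
    query: str
    subject: str
    pident: float
    evalue: float
    bitscore: float


def best_hits(handle: TextIO, max_evalue: float = 1e-5) -> Dict[str, Hit]:
    """Return the highest-bitscore hit per query from 12-column tabular output."""
    best: Dict[str, Hit] = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        # Columns 1, 2, 3, 11 and 12 hold query, subject, identity, E-value, bit score.
        hit = Hit(row[0], row[1], float(row[2]), float(row[10]), float(row[11]))
        if hit.evalue > max_evalue:
            continue
        if hit.query not in best or hit.bitscore > best[hit.query].bitscore:
            best[hit.query] = hit
    return best


# Hypothetical demo rows standing in for a real result file.
demo = io.StringIO(
    "q1\tsA\t98.5\t200\t3\t0\t1\t200\t5\t204\t1e-100\t370\n"
    "q1\tsB\t80.0\t150\t30\t2\t1\t150\t10\t159\t1e-30\t120\n"
    "q2\tsC\t45.0\t90\t49\t3\t1\t90\t1\t90\t0.5\t30.1\n"
)
hits = best_hits(demo)
for query, hit in hits.items():
    print(query, hit.subject, hit.bitscore)
```

The E-value threshold is a judgment call that depends on database size and the question being asked; the default here is only a placeholder.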

1.3) Multiple sequence alignment

2) Genome Browsers

See more in our Tutorial.

3) DNA-seq

(3.1) Mapping and QC

(3.2) Variant Calling

(3.3) Assembly

De novo assembly software

  • SPAdes

    • the sub-utility metaSPAdes is designed for metagenome assembly

  • megahit: designed for metagenome assembly

(3.4) CNV

(3.5) SV (structural variation)

4) RNA-seq

(4.1) RNA-seq

(4.2) Single Cell RNA-seq (scRNA-seq)

  • awesome-single-cell: a collection of single cell analysis tools

  • seurat: a widely used R package

  • scanpy: a widely used python package

  • monocle: Trajectory analysis

  • cellphonedb: Cell-cell interaction analysis

  • scenic: Transcriptional regulatory network

  • Tutorials

    • https://bioconductor.org/books/release/OSCA/

    • https://github.com/theislab/single-cell-tutorial

Nature Biotechnology 2020 38(3):254-257

| Software name | Developer | Price structure | Platform-specific | Relevant stages of experiment |
|---|---|---|---|---|
| | 10X Genomics | Free download | 10X Chromium | Raw read alignment, QC and matrix generation for scRNA-seq and ATAC-seq; data normalization; dimensionality reduction and clustering |
| | 10X Genomics | Free download | 10X Chromium | Visualization and analysis |
| | Partek | License | No | Complete data analysis and visualization pipeline for scRNA-seq data |
| | Qlucore | License | No | scRNA-seq data filtering, dimensionality reduction and clustering, visualization |
| | Takara Bio | Free download | Takara ICell8 | Raw read alignment and matrix generation for scRNA-seq |
| | Takara Bio | Free download | Takara ICell8 | Clustering and analysis of mappa data |
| | Fluidigm | Free download | Fluidigm C1 or Biomark | Analysis and visualization of differential gene expression data for scRNA-seq |
| | FlowJo/BD Biosciences | License | No | Data normalization and QC, dimensionality reduction and clustering, analysis and visualization |
| | Seven Bridges/BD Biosciences | License | BD Rhapsody and Precise | Cloud-based raw read alignment, QC and matrix generation |
| | Mission Bio | Free download | Mission Bio Tapestri | Analysis of single-cell genomics data |
| | Illumina | License | Illumina SureCell libraries | Raw read alignment and matrix generation |
| | Qiagen | License | No | Raw read alignment, QC and matrix generation, dimensionality reduction and clustering |

(4.3) Assembly

  • Trinity: transcript assembly from RNA-seq data

5) Interactome

(5.1) ChIP-seq

  • MACS: peak calling

  • homer: peak calling, motif finding, etc.

  • ChIPseeker: visualization and annotation

(5.2) CLIP-seq

(5.3) Motif analysis

sequence

  1. MEME: motif-based sequence analysis tools, http://meme-suite.org/

  2. HOMER: software for motif discovery and next-gen sequencing analysis, http://homer.ucsd.edu/homer/motif/

structure

  1. RNApromo: computational prediction of RNA structural motifs involved in post-transcriptional regulatory processes, https://genie.weizmann.ac.il/pubs/rnamotifs08/

  2. GraphProt: modeling binding preferences of RNA-binding proteins, http://www.bioinf.uni-freiburg.de/Software/GraphProt/

6) Epigenetic Data

(6.1) ChIP-seq

(6.2) DNase-seq

(6.3) ATAC-seq

7) Microbe data analysis

  • kraken2: k-mer based fast metagenome reads classification

  • metaphlan: marker gene based estimation of microbial taxonomic abundance

  • motu: marker gene based estimation of microbial taxonomic abundance

  • maxbin: binning contigs into metagenome-assembled genomes (MAGs)

  • mash: rapid estimation of distance between genomes

  • drep: pick representative genomes from sample-wise assemblies

  • prodigal: prokaryote gene prediction

  • prokka: pipeline for prokaryote genome annotation

  • qiime2: 16S amplicon sequencing data analysis
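
To make the idea behind mash more concrete: it compares small MinHash sketches of k-mers instead of full genomes, estimates the Jaccard index j from the sketches, and converts it to a distance D = -(1/k) ln(2j / (1 + j)). The Python below is a simplified sketch of that idea only. It uses SHA-1 rather than the hash function used by mash, does not merge reverse-complement k-mers, and all function names and demo sequences are hypothetical.

```python
import hashlib
import math
from typing import List, Set


def kmer_hashes(seq: str, k: int) -> Set[int]:
    """Hash every k-mer of seq to a 64-bit integer (SHA-1 is an assumption here)."""
    seq = seq.upper()
    hashes = set()
    for i in range(len(seq) - k + 1):
        digest = hashlib.sha1(seq[i:i + k].encode()).digest()
        hashes.add(int.from_bytes(digest[:8], "big"))
    return hashes


def bottom_sketch(seq: str, k: int, size: int) -> List[int]:
    """Keep only the `size` smallest k-mer hashes as the sketch."""
    return sorted(kmer_hashes(seq, k))[:size]


def jaccard_estimate(sketch_a: List[int], sketch_b: List[int], size: int) -> float:
    """Estimate the Jaccard index from the `size` smallest hashes of the union."""
    merged = sorted(set(sketch_a) | set(sketch_b))[:size]
    if not merged:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)


def mash_distance(j: float, k: int) -> float:
    """Convert a Jaccard estimate into the Mash distance; 1.0 if nothing is shared."""
    if j <= 0:
        return 1.0
    return max(0.0, -math.log(2 * j / (1 + j)) / k)


# Hypothetical toy sequences; real genomes use larger k and sketch sizes.
K, SIZE = 4, 8
genome_a = "ACGTACGTTGCAACGTTAGCCGATAGGCTA"
genome_b = "ACGTACGTTGCAACGTTAGCCGTTAGGCTA"
j = jaccard_estimate(bottom_sketch(genome_a, K, SIZE), bottom_sketch(genome_b, K, SIZE), SIZE)
print(j, mash_distance(j, K))
```

Identical inputs give an estimated Jaccard index of 1 and a distance of 0, while inputs with no shared k-mers give a distance of 1.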

More: Shared tools and scripts

More: Software for the ages

| Software | Purpose | Creators | Key capabilities | Year released | Citations^a |
|---|---|---|---|---|---|
| BLAST | Sequence alignment | Stephen Altschul, Warren Gish, Gene Myers, Webb Miller, David Lipman | First program to provide statistics for sequence alignment, combination of sensitivity and speed | 1990 | 35,617 |
| R | Statistical analyses | Robert Gentleman, Ross Ihaka | Interactive statistical analysis, extendable by packages | 1996 | N/A |
| ImageJ | Image analysis | Wayne Rasband | Flexibility and extensibility | 1997 | N/A |
| Cytoscape | Network visualization and analysis | Trey Ideker et al. | Extendable by plugins | 2003 | 2,374 |
| Bioconductor | Analysis of genomic data | Robert Gentleman et al. | Built on R, provides tools to enhance reproducibility of research | 2004 | 3,517 |
| Galaxy | Web-based analysis platform | Anton Nekrutenko, James Taylor | Provides easy access to high-performance computing | 2005 | 309^b |
| MAQ | Short-read mapping | Heng Li, Richard Durbin | Integrated read mapping and SNP calling, introduced mapping quality scores | 2008 | 1,027 |
| Bowtie | Short-read mapping | Ben Langmead, Cole Trapnell, Mihai Pop, Steven Salzberg | Fast alignment allowing gaps and mismatches based on Burrows-Wheeler Transform | 2009 | 1,871 |
| Tophat | RNA-seq read mapping | Cole Trapnell, Lior Pachter, Steven Salzberg | Discovery of novel splice sites | 2009 | 817 |
| BWA | Short-read mapping | Heng Li, Richard Durbin | Fast alignment allowing gaps and mismatches based on Burrows-Wheeler Transform | 2009 | 1,556 |
| Circos | Data visualization | Martin Krzywinski et al. | Compact representation of similarities and differences arising from comparison between genomes | 2009 | 431 |
| SAMtools | Short-read data format and utilities | Heng Li, Richard Durbin | Storage of large nucleotide sequence alignments | 2009 | 1,551 |
| Cufflinks | RNA-seq analysis | Cole Trapnell, Steven Salzberg, Barbara Wold, Lior Pachter | Transcript assembly and quantification | 2010 | 710 |
| IGV | Short-read data visualization | James Robinson et al. | Scalability, real-time data exploration | 2011 | 335 |

N/A, paper not available in Web of Science.

From: The anatomy of successful computational biology software
