Bioinformatics Tutorial

A Practical Tutorial in Bioinformatics

Teaching Philosophy

🎦 Study and Practice | Investigate things to gain knowledge; unite knowledge and action (格物致知 知行合一)

"Tell me and I forget. Teach me and I remember. Involve me and I learn." - attributed to Benjamin Franklin

We teach professional skills in bioinformatics. These skills go beyond running software: they give you the freedom to explore real data of many kinds.

Aim

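Preface

Compared with the past, we have suddenly found that data are not too scarce but too abundant, and that information is not lacking but overwhelming. The key abilities for the new generation are "discerning" and "mining".

For work in bioinformatics, the most important and most useful basic tools and skills have always been, and I believe will remain for a long time to come:

  1. Google

  2. Wikipedia

  3. Zhihu (知乎)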

We aim to teach basic data skills that give you freedom.

  • Running bioinformatics software isn’t all that difficult, doesn’t take much skill, and it doesn’t embody any of the significant challenges of bioinformatics. … These data skills give you freedom.

  • Two qualities matter most here: reproducibility and robustness.

  • So what is a reproducible bioinformatics project? At the very least, it’s sharing your project’s code and data.

  • In wet lab biology, when experiments fail, it can be very apparent, but this is not always true in computing. Electrophoresis gels that look like Rorschach blots rather than tidy bands clearly indicate something went wrong. Unfortunately, without prior expectations, it can be quite difficult to distinguish good results from bad results.

  • The easy way to ensure everything is working properly is to adopt a cautious attitude, and check everything between computational steps.

  • You will almost certainly have to rerun an analysis more than once.

  • Write Code for Humans, Write Data for Computers

  • Use Existing Libraries Whenever Possible

  • Treat Data as Read-Only

  • Document Everything (-- Too geeky?) Just as a well-organized laboratory makes a scientist’s life easier, a well-organized and well-documented project makes a bioinformatician’s life easier.

-- <<Bioinformatics Data Skills>>
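
Two of the points above, "check everything between computational steps" and "treat data as read-only", can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a method taken from the book: the file names `raw.fa` and `upper.fa`, the toy FASTA content, and the helper names `sha256sum`, `make_read_only`, and `count_fasta_records` are all hypothetical and chosen for this example.

```python
import hashlib
import stat
import tempfile
from pathlib import Path


def sha256sum(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_read_only(path: Path) -> None:
    """Clear all write permission bits so raw data is not edited by accident."""
    mode = path.stat().st_mode
    path.chmod(mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))


def count_fasta_records(path: Path) -> int:
    """Count FASTA records by counting header lines that start with '>'."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def demo() -> dict:
    """Run one toy analysis step and check the data before and after it."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical raw input: a tiny FASTA file with two records.
        raw = Path(tmp) / "raw.fa"
        raw.write_text(">seq1\nacgt\n>seq2\nggcc\n")

        # Treat data as read-only, and record a checksum for later comparison.
        make_read_only(raw)
        checksum_before = sha256sum(raw)

        # One computational step: write results to a NEW file, never in place.
        derived = Path(tmp) / "upper.fa"
        derived.write_text(raw.read_text().upper())

        # Check between steps: no records were lost and the input is untouched.
        n_raw = count_fasta_records(raw)
        n_derived = count_fasta_records(derived)
        assert n_raw == n_derived, "record count changed during the step"
        raw_unchanged = sha256sum(raw) == checksum_before
        assert raw_unchanged, "raw data was modified"

        return {
            "records_raw": n_raw,
            "records_derived": n_derived,
            "raw_unchanged": raw_unchanged,
            "raw_owner_write_bit": bool(raw.stat().st_mode & stat.S_IWUSR),
        }


if __name__ == "__main__":
    print(demo())
```

The design choice here is that every step writes to a new file and is followed by cheap assertions, so a silent failure (a truncated output, an overwritten input) surfaces immediately rather than several steps later. In a real project the checksums would more likely be stored in a file alongside the data so they can be rechecked whenever the analysis is rerun.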

Courses required before this tutorial

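  1. Basic biology: e.g., Genetics and/or Molecular Biology

  2. Basic statistics: e.g., Probability Theory and/or Biostatistics

  3. Basic mathematics: e.g., Calculus and Linear Algebra

  4. Basic computing: e.g., Linux and programming in C or Python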

Major Authors

Yumin Zhu, Gang Xu, Zhuoer Dong, Yinghui Chen, Meifeng Zhou, Xupeng Chen, Xiaochen Xi, Xi Hu, Jingyi Cao, Xiaofan Liu, Weihao Zhao, Siqi Wang and Zhi J. Lu

Section | Major Authors
Part I. Basic Skills |
1. Setup | Zhi John Lu
1.1 Docker | Gang Xu/Yunfan Jin
1.2 Cluster | Gang Xu/Xiaofan Liu/Yunfan Jin
2. Linux | Zhi John Lu
2.1 Basic Command | Xi Hu
2.2 Practice Guide | Xi Hu/Zhuoer Dong
2.3 Linux Bash | Gang Xu
3. R |
3.1 R Basics | Zhuoer Dong
3.2 Plot with R | Xiaochen Xi/Zhuoer Dong
4. Python | Yuhuan Tao
Part II. Basic Analyses |
1. Blast | Gang Xu
2. Conservation Analysis | Xi Hu
3. Function Analysis |
3.1 GO | Gang Xu
3.2 KEGG | Gang Xu
3.3 GSEA | Zhuoer Dong
4. Clinical Analysis |
4.1 Survival Analysis | Xiaochen Xi/Yumin Zhu
Part III. NGS Data Analyses |
1. Mapping | Meifeng Zhou/Yumin Zhu
1.1 Genome Browser | Xiaofan Liu/Shang Zhang
1.2 bedtools and samtools | Xiaofan Liu/Yunfan Jin
2. RNA-seq |
2.1 Expression Matrix | Xiaofan Liu
2.2 Differential Expression with Cufflinks | Meifeng Zhou/Shang Zhang
2.3 Differential Expression with DESeq2 and edgeR | Xinzhe Ni/Shang Zhang
2.4 Alternative Splicing | Zhuoer Dong/Shang Zhang
3. ChIP-seq | Jingyi Cao/Xiaofan Liu
4. Motif |
4.1 Sequence Motif | Yumin Zhu
4.2 Structure Motif | Yumin Zhu
5. Network |
5.1 Co-expression Network | Xiaochen Xi
5.2 miRNA Targets | Yumin Zhu
5.3 CLIP-seq (RNA-Protein Interactions) | Yumin Zhu/Xiaofan Liu
6. RNA Regulation Analyses |
6.1 Alternative Splicing | Zhuoer Dong/Shang Zhang
6.2 APA (Alternative Polyadenylation) | Yumin Zhu
6.3 Chimeric RNA | Yinghui Chen
6.4 RNA Editing | Yumin Zhu
6.5 SNV/INDEL | Yinghui Chen
6.6 RNA Modification |
6.7 RNA Degradation |
6.8 Translation: Ribo-seq | Yumin Zhu/Weihao Zhao/Xiaofan Liu
6.9 RNA Structure | Yumin Zhu/Xiaofan Liu
Part IV. Machine Learning |
1. Machine Learning Basics | Xiaofan Liu/Xupeng Chen/Zhi John Lu
1.1 Data Pre-processing | Xinzhe Ni/Xiaofan Liu
1.2 Data Visualization & Dimension Reduction | Xinzhe Ni/Xiaofan Liu
1.3 Feature Extraction and Selection | Xinzhe Ni/Xiaofan Liu
1.4 Machine Learning Classifiers/Models | Xinzhe Ni/Xiaofan Liu
1.5 Performance Evaluation | Xiaofan Liu
2. Machine Learning with R | Xupeng Chen/Xiaofan Liu
3. Machine Learning with Python | Xupeng Chen/Xiaofan Liu
Part V. Quiz |
1. Precision Medicine - exSEEK | Xiaofan Liu/Xupeng Chen
2. RNA Regulation - RiboShape | Xiaofan Liu/Yizi Zhao
3. Single Cell Data Analysis | Yu Li
Appendix |
Appendix I. Keep Learning | Zhi John Lu
Appendix II. Databases & Servers | Yumin Zhu
Appendix III. How to Backup | Gang Xu/Zhi John Lu
Appendix IV. Teaching Materials | Gang Xu/Xiaofan Liu/Zhi John Lu
Appendix V. Software and Tools | Yumin Zhu

Contact Us

Copyright © 2022 Lu Lab

https://www.apache.org/licenses/LICENSE-2.0

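2016-2022, at Tsinghua University (清华园)

This book is mainly used for two Tsinghua University courses: the undergraduate course "Bioinformatics" and the doctoral course "Practical Bioinformatics".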
