[ Multi-omic Analysis ] Immune Response Against COVID-19: an -omics approach


바이오 대표 2022. 1. 19. 00:08

By integrating several -omic datasets (more data, and potentially more statistical power), we can analyze complex biological big data.

( Genomics, Transcriptomics, Proteomics, Epigenomics )

* Epigenome: the collection of chemical compounds and proteins that attach directly to DNA or histone proteins and regulate gene expression

Abstract

The aim is to understand the human response to the virus (COVID-19), especially the immune response. Accordingly, the goal of the paper is to identify the immune genes that play a major role in the immune response to COVID-19.

Multi-omic datasets used:

Microarray (E-MTAB-8871)
RNA-seq (E-MTAB-9221)
RNA-seq (GSE152418)
ChIP-seq (GSE108881)

Introduction

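The immune response can be divided into [1] the innate immune response and [2] the adaptive immune response. The adaptive response is carried out by T cells and B cells, and T cells include CD4 T cells and CD8 (cytotoxic) T cells. CD4 T cells (helper T cells) secrete cytokines, which bring about the immune response. However, if cytokines are released in large amounts during this process, it can become a problem (a cytokine storm).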

# -Omics Data Used

Microarray
# Whole blood cells: erythrocytes, leukocytes, and platelets
- Transcriptomic profiles of blood sampled via the NanoString Human Immunology V2 Panel
- Negative and positive controls in the probe sets (for normalization)
- Time-series samples

RNA-seq (PBMC samples)
# Peripheral Blood Mononuclear Cells (PBMCs): lymphocytes and monocytes
- Sequencing w/ a single Illumina HiSeq2000 flow cell
- COVID patients in four stages (convalescent, moderate, severe, and intensive care unit)

RNA-seq (organoid)
# Organoids (lung epithelial cells) infected with COVID
- Sequencing w/ Illumina NovaSeq 6000

ChIP-seq
# Calu3 (human lung cancer cell) infected w/ MERS-CoV (Middle East respiratory syndrome coronavirus)
- PMS (peripheral blood smear) sonicated and immunoprecipitated w/ anti-H3K4me3, anti-H3K27me3
* H3K4me3: epigenomic modification on histone H3 for gene expression regulation (promotes expression)
* H3K27me3: associated with downregulation (represses expression)

* No nucleus: erythrocytes, platelets

* Mononuclear: lymphocytes (T cells, B cells, NK cells) and monocytes

* Multi-lobed nucleus: granulocytes (neutrophils, basophils, eosinophils)

* White blood cells: all blood cells other than erythrocytes and platelets

Analysis Methods

# R Packages Used and Analysis Process

Microarray
Preprocessing & expression analysis:
NanoStringR (nanoR)
[0] rcc files
[1] QC, background correction, normalization
* background correction by SD
* geometric mean for positive control normalization
* housekeeping normalization
[2] Time-series categorization
limma
[3] Differential expression analysis
[4] FDR correction (Benjamini-Hochberg)
Gene set analysis:
goana (limma): to determine Gene Ontology terms
kegga (limma): to find over-represented pathways in DEGs

RNA-seq (PBMC samples)
Preprocessing & expression analysis:
STAR
[0] Reads mapped to the human genome (GRCh38), counted w/ STAR using htseq-count
edgeR
[1] DGE
- Normalization
- DE through a quasi-likelihood F-test
[2] FDR correction (Benjamini-Hochberg)
Gene set analysis: same as microarray (goana, kegga)

RNA-seq (organoid)
Preprocessing & expression analysis:
SRA toolkit
[1] SRA (Sequence Read Archive) --> Fastq
[2] QC, trimming (Trimmomatic)
[3] Aligned onto GRCh38 (kallisto)
edgeR
[4] Count normalization
[5] Statistical analysis
Gene set analysis: same as microarray (goana, kegga)

ChIP-seq
Preprocessing & expression analysis:
SRA toolkit
[1] SRA (Sequence Read Archive) --> Fastq
[2] QC, trimming (Trimmomatic)
[3] Aligned onto GRCh38 (STAR)
[4] Peak calling (MACS2)
[5] MACS2 broad peak file --> GRanges
[6] Identify the overlapping peaks w/ H3K27me3/H3K4me3
Gene set analysis: same as microarray (goana, kegga)
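The microarray preprocessing steps listed above (background correction by SD, geometric-mean normalization on the positive controls, housekeeping normalization) can be sketched in a few lines. This is a minimal illustration in Python, not the nanoR implementation: the function names, the toy counts, and the choice of mean + 2 SD as the background level are all assumptions made for the example.

```python
from math import exp, log
from statistics import mean, stdev


def geometric_mean(values):
    """Geometric mean of strictly positive counts."""
    return exp(mean(log(v) for v in values))


def background_threshold(negative_controls, n_sd=2.0):
    """Background level = mean + n_sd * SD of the negative-control counts.

    n_sd=2.0 is an assumption for this sketch, not a value from the post.
    """
    return mean(negative_controls) + n_sd * stdev(negative_controls)


def subtract_background(counts, threshold):
    """Subtract the background level and floor the result at zero."""
    return [max(c - threshold, 0.0) for c in counts]


def scaling_factors(control_counts_per_sample):
    """One factor per sample: (average geometric mean) / (sample geometric mean).

    Used once with the positive controls and, in the same way, once with
    the housekeeping genes.
    """
    geo = [geometric_mean(c) for c in control_counts_per_sample]
    target = mean(geo)
    return [target / g for g in geo]


# Toy data: two samples, two positive-control probes each (hypothetical counts).
positive_controls = [[100.0, 400.0], [200.0, 800.0]]
factors = scaling_factors(positive_controls)

# Toy negative controls for one sample (hypothetical counts).
threshold = background_threshold([4.0, 6.0, 8.0])
corrected = subtract_background([5.0, 25.0], threshold)
```

Multiplying every count of a sample by its factor brings the control probes of all samples to the same average level, which is what makes the samples comparable before the limma step.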

* Type I error (FDR: False Discovery Rate) correction # Type I error: judging something to be true when it is actually false (a false positive)

[1] Benjamini-Hochberg (BH) https://www.youtube.com/watch?v=K8LQSvtjcEo&t=633s

[2] Bonferroni (significance threshold / # of tests performed; equivalently, each original p-value x # of tests performed)
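To make the two corrections concrete, here is a small Python sketch that computes adjusted p-values for both. It is an illustration only (the analysis itself used the R packages above), and the function names and example p-values are made up for the example.

```python
def bonferroni(p_values):
    """Bonferroni: multiply each p-value by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(p * m, 1.0) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    The p-value with rank r (1 = smallest) becomes p * m / r; walking from
    the largest p-value down, a running minimum keeps the adjusted values
    monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four genes.
p = [0.01, 0.04, 0.03, 0.005]
bh = benjamini_hochberg(p)   # approximately [0.02, 0.04, 0.04, 0.02]
bonf = bonferroni(p)         # approximately [0.04, 0.16, 0.12, 0.02]
```

With a cutoff of 0.05, BH keeps all four genes while Bonferroni keeps only two, which shows why BH is the usual choice when thousands of genes are tested at once.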

* GRCh38: Genome Reference Consortium Human Build 38

* htseq-count: counts for each gene how many aligned reads overlap its exons

* When the Ensembl ID or Entrez ID of a gene is known, the related information can be annotated through org.Hs.eg.db.
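The gene set analysis step (goana / kegga) asks whether a GO term or pathway contains more DE genes than expected by chance. A common way to express this is a one-sided hypergeometric test, sketched below in Python. This is a simplified illustration with a hypothetical function name and toy numbers; it ignores options such as gene-length bias adjustment and is not a reimplementation of limma.

```python
from math import comb


def over_representation_p(universe_size, set_size, n_de, n_de_in_set):
    """One-sided hypergeometric p-value P(X >= n_de_in_set).

    universe_size: number of genes tested
    set_size:      number of those genes annotated to the term/pathway
    n_de:          number of differentially expressed (DE) genes
    n_de_in_set:   number of DE genes annotated to the term/pathway
    """
    upper = min(set_size, n_de)
    total = comb(universe_size, n_de)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, n_de - k)
        for k in range(n_de_in_set, upper + 1)
    )
    return tail / total


# Toy example: 10 genes tested, 3 in the pathway, 4 DE genes, 3 of them in the pathway.
p_value = over_representation_p(10, 3, 4, 3)   # 7 / 210
```

A small p-value means the pathway holds more DE genes than random sampling would explain; these p-values are then corrected for multiple testing across all terms.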

# Visualizations Used

plotPCA / plotMDS	Shows correlation or clustering of samples via dimensionality reduction
volcanoPlot	Scatter plot of statistical significance (p-value) vs. fold change (FC)
plotBCV (edgeR)	Shows the estimated tagwise, common, and trended dispersions * Tagwise: allows a different dispersion value to be used for each gene
MA plot	Log fold change vs. average expression level
pheatmap	Heatmap: shows the magnitude of a phenomenon as color
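The volcano plot and MA plot differ only in which quantities go on the axes. The Python sketch below computes the coordinates of one gene for each plot; the function names, the pseudocount of 1, and the cutoffs (|log2 FC| >= 1, adjusted p <= 0.05) are assumptions chosen for illustration, not values taken from the paper.

```python
from math import log2, log10


def ma_point(mean_a, mean_b, pseudocount=1.0):
    """MA plot coordinates for one gene.

    Returns (A, M): A is the average log2 expression of the two groups,
    M is the log2 fold change of group b over group a.
    """
    a = log2(mean_a + pseudocount)
    b = log2(mean_b + pseudocount)
    return (0.5 * (a + b), b - a)


def volcano_point(log_fc, p_value):
    """Volcano plot coordinates: (log fold change, -log10 p-value)."""
    return (log_fc, -log10(p_value))


def is_highlighted(log_fc, adj_p, lfc_cutoff=1.0, alpha=0.05):
    """True when a gene passes both the effect-size and significance cutoffs."""
    return abs(log_fc) >= lfc_cutoff and adj_p <= alpha


# Toy gene: mean expression 3 in controls and 15 in patients (hypothetical values).
A, M = ma_point(3.0, 15.0)      # log2(4) = 2 and log2(16) = 4
x, y = volcano_point(M, 0.001)
```

Genes far from M = 0 on the MA plot, or in the upper left and upper right corners of the volcano plot, are the DE candidates.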

Conclusion

In each dataset, protein-coding genes such as HLA and immune-related genes (immunoglobulin fragments (IgG receptor, IgA, IgM), B cell receptor, interleukin genes) were more highly expressed. The overall results are as follows.

[1] Genes significant in common across all datasets = Cytokine (HLA-DPA1, PTGER4, NFIL3)

[2] Gene set analysis confirmed that mitochondria and oxidative phosphorylation were altered in infected patients.

Because the four datasets were not that compatible, the results were not as powerful as expected, but with more closely related datasets it should be possible to get much closer to "more data, more powerful results".

* Limitation: this project was carried out (2020.10) less than a year after COVID-19 emerged (2019.12), so there was not enough data available.
