바이오 대표

All posts (217)

[R - Basics] Data transformation with "dplyr"

Summary:
filter( ) - pick rows by their values
arrange( ) - reorder the rows
select( ) - pick variables (columns) by name
mutate( ) - create new variables from existing variables
summarize( ) ** can be combined with group_by( )

Example data:
library(nycflights13)
library(tidyverse)
flights
#> # A tibble: 336,776 x 19
#>    year month   day dep_time sched_dep_time dep_delay arr_time sched_arr_time
#>
#> 1  2013     1     1      517            515         2      830            819
#> 2  2013     1     1      533            529         4      8..

R 2023. 1. 10. 08:33

[NGS scATACseq] Analyzing cis-regulatory gene networks from scATACseq data with Cicero

Cicero: Cicero is a tool that takes single-cell chromatin accessibility data, examines co-accessibility, and then builds and analyzes a cis-regulatory network. I used it to analyze scATACseq data. How? chromatin accessibility data → find regions of the genome that are in physical proximity → identify putative enhancer-promoter pairs → build the cis-regulatory network. Step 0. Create an input CDS: Cicero works on a CDS (cell_data_set) class object. 10x scATACseq data..

Bioinformatics/Tools 2022. 12. 28. 04:01

[NGS scRNAseq] Interpreting the cellranger count output files and summary.html

Cell Ranger OUTPUT FILES: Cell Ranger writes its output to the outs/ folder, which contains the sequencing data, the annotated read sequences, gene expression matrices, and so on. More detailed information on each file: Matrices, Web Summary .html, Secondary Analysis CSV, BAM, Molecule Info (h5), Loupe File (.cloupe). Summary.html (WEB SUMMARY .html): the summary and analysis that Cell Ranger provides can be viewed in HTML form. As the following figure shows, it is broadly divided into summary and gene ..

Bioinformatics/Tools 2022. 12. 19. 09:44

[NGS scRNAseq] Chromium 10x / Illumina basics (workflow) and Cell Ranger count

This post focuses on single-cell RNAseq using Illumina (10x) technology. Chromium 10x single cell - Barcodes and UMI: the paired-end sequencing output is usually two fastq files, each read in the 5' → 3' direction. Read 1 (R1) always contains the cell barcode + UMI (unique molecular identifier) part of the primer, and Read 2 (R2) is read from the opposite end of the fragment and covers the cDNA insert (see figure 1.3). Using the reads obtained from sequencing (containing cell barcode, UMI and cDNA), trans..

Bioinformatics/Tools 2022. 12. 18. 16:53
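
The R1 layout described in the excerpt above (cell barcode followed by UMI) can be sketched in a few lines of Python. This is only an illustration: the lengths used here (16 bp barcode, 12 bp UMI) are an assumption that matches 10x 3' v3 chemistry, the function name split_r1 is hypothetical, and the example read is made up.

```python
from typing import NamedTuple

# Assumed layout (10x 3' v3 chemistry): 16 bp cell barcode followed by a 12 bp UMI.
BARCODE_LEN = 16
UMI_LEN = 12


class R1Tag(NamedTuple):
    cell_barcode: str
    umi: str


def split_r1(sequence: str) -> R1Tag:
    """Split a Read 1 sequence into its cell barcode and UMI parts."""
    needed = BARCODE_LEN + UMI_LEN
    if len(sequence) < needed:
        raise ValueError(f"R1 is {len(sequence)} bp, expected at least {needed}")
    return R1Tag(sequence[:BARCODE_LEN], sequence[BARCODE_LEN:needed])


# Made-up R1 sequence, for illustration only.
example = "AAACCCAAGAAACACT" + "TTGCATGCAAGT"
tag = split_r1(example)
print(tag.cell_barcode, tag.umi)
```

Other chemistries use different lengths (for example a 10 bp UMI in 3' v2), so the two constants would need to be adjusted to the kit actually used.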

[Bioinformatics] NGS file formats - fastq, sam, bam, bed, bigwig, ...

Every NGS analysis starts from the sequencing file (fastq) produced by the sequencer, and the data takes on particular characteristics as the analysis proceeds. Of the many data formats used in the NGS world, I have summarized the most basic file formats here.
* Example sequencers: Illumina HiSeq 2500, Illumina NextSeq 500, Illumina MiSeq, ...
FASTQ : sequencing data with scores
SAM : output of aligning (mapping) a fastq file (human-readable version)
BAM : binary version of SAM; not human-readable, but smaller in size
VCF : (Variant Calling F..

Bioinformatics/NGS Basics 2022. 12. 4. 10:00
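
To make "sequencing data with scores" concrete, here is a minimal Python sketch of reading FASTQ text. It assumes the common 4-lines-per-record layout and Phred+33 quality encoding; the name parse_fastq and the example record are made up for illustration, and real files are better handled with an established parser.

```python
from typing import Iterator, NamedTuple


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    qualities: list[int]


def parse_fastq(text: str) -> Iterator[FastqRecord]:
    """Yield records from FASTQ text, assuming 4 lines per record and Phred+33."""
    lines = [line.strip() for line in text.strip().splitlines()]
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ text should contain a multiple of 4 lines")
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i : i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        # Phred+33: quality score = ASCII code of the character minus 33.
        yield FastqRecord(header[1:], seq, [ord(c) - 33 for c in qual])


# Made-up record, for illustration only.
example = """@read1
GATTACA
+
IIIII#5
"""
for record in parse_fastq(example):
    print(record.name, record.sequence, record.qualities)
```

Each base gets one quality character, which is why the sequence and quality lines must have the same length.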

[MACS2] Peak calling with MACS2

MACS2 (Model-based Analysis of ChIP-seq) for ChIP-seq or ATAC-seq. Peak calling = finding the genomic regions that are enriched in aligned reads in a ChIP-seq experiment. In the alignment files (SAM/BAM) from ChIP-seq, the read densities on the sense (+) strand and the antisense (-) strand differ. The + and - strands can be told apart by the 5' ends. Statistics are used to evaluate the distribution of each group and, by comparison against the background, to check whether an enrichment site is a genuine binding site. peak c..

Bioinformatics 2022. 10. 3. 14:56
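
The idea of separating + and - strand reads by their 5' ends can be sketched as follows. This is not MACS2's implementation, just a minimal Python illustration that assumes mapped reads described by three SAM fields (FLAG, 1-based POS, CIGAR); the function name five_prime_end and the example reads are hypothetical.

```python
import re
from collections import Counter

REVERSE_FLAG = 0x10  # SAM flag bit: read aligned to the reverse strand
# CIGAR operations that consume reference bases.
REF_CONSUMING = set("MDN=X")


def five_prime_end(flag: int, pos: int, cigar: str) -> tuple[str, int]:
    """Return (strand, 1-based 5' end position) for one aligned read."""
    if not flag & REVERSE_FLAG:
        return "+", pos  # forward read: 5' end is the leftmost aligned base
    ref_len = sum(
        int(n)
        for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)
        if op in REF_CONSUMING
    )
    return "-", pos + ref_len - 1  # reverse read: 5' end is the rightmost base


# Made-up (flag, pos, cigar) triples, for illustration only.
reads = [(0, 100, "50M"), (0, 110, "50M"), (16, 250, "50M"), (16, 260, "20M2D30M")]
ends = [five_prime_end(*r) for r in reads]
print(ends)
print(Counter(strand for strand, _ in ends))
```

Piling up these 5' end positions per strand gives the two offset distributions that the excerpt refers to.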
