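ChIP-seq Analysis Notes
Contents
Preface
I. Software installation
II. Creating the environment and installing software
1. Create the chipseq environment
2. Install software in the chipseq environment
III. Analysis steps
1. Data source
2. Download the data and rename
3. Convert to FASTQ (--sra-id is given the renamed file instead of the SRR accession, which speeds up the run; it is very fast)
4. Quality control
4.1 trim_galore
4.2 FastQC quality report (older data)
5. bowtie2 alignment (15 min/file)
6. samtools: convert SAM to BAM
6.1 view & sort
6.2 index
7. bamCoverage: convert BAM to bw (bigWig)
8. View peaks in the bw files with IGV
9. MACS2 callpeak
10. MACS2 differential analysis (2021-10-15, to be completed)
11. Motif analysis with HOMER
11.1 List all available packages
11.2 Download annotation data
11.3 Create a tag directory
11.4 Differentially Bound Peaks (finding differential peaks)
11.5 Annotate peaks (peak annotation)
Merging differential peaks and peak annotations in R: fread, merge functions
11.6 Other features
12. ROSE: finding super enhancers (SE)
12.1 Reference tutorial: ROSE | 超级增强子鉴定 (super-enhancer identification)
12.2 ROSE installation: see above
12.3 Finding SEs with ROSE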
Preface
These are notes I keep for my own reference.
I. Software installation
- Machine: Mac with an M1 chip
- Software installed
- Reference tutorial: "Macbook Pro M1芯片Python开发环境配置" (Python development environment setup on the MacBook Pro M1), from 朝歌晚酒南栀雪's blog on CSDN
- xcode
- iTerm2
- git
- homebrew: brew install wget   # install wget
- anaconda-pkg
  - Anaconda | Individual Edition (download link)
  - "Installing on macOS", Anaconda documentation (help link)
- PyCharm CE
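II. Creating the environment and installing software
1. Create the chipseq environment
conda create -n chipseq python=3
conda activate chipseq   # activate the environment
conda deactivate   # leave the environment
2. Install software in the chipseq environment
PS: install the packages one at a time; otherwise the install tends to hang.
conda install -y -c bioconda parallel-fastq-dump   # SRR → FASTQ conversion
conda install -y trim-galore   # quality control
conda install -y -c bioconda bowtie2   # align FASTQ reads to hg19
brew tap homebrew/science   # this tap has since been deprecated; if it fails, brew install samtools alone may suffice
brew install samtools   # convert SAM to BAM
conda install -y macs2   # peak calling and quantification, differential analysis
conda install -y -c conda-forge libgfortran   # helper for MACS2 differential analysis
conda install -y -c bioconda deeptools   # visualization; includes bamCoverage
https://deeptools.readthedocs.io/en/latest/content/tools/bamCoverage.html
IGV download: https://software.broadinstitute.org/software/igv/download   # visualization
ROSE installation
ROSE download: https://bitbucket.org/young_computation/rose/src/master/
# Put all the files to be processed in the rose folder: sorted.bam, bed, bam.bai
# Create a Python 2.7 environment
conda create -n rose python=2.7
conda activate rose
HOMER installation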
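- WeChat public-account reference: "HOMER | chipseq数据进行peaks的差异分析" (HOMER | differential peak analysis of ChIP-seq data)
- Install HOMER: in the chipseq environment, run conda install -c bioconda homer
- Download configureHomer.pl: http://homer.ucsd.edu/homer/configureHomer.pl
- Place the file in the folder where HOMER should be installed, here /Users/xusiqi/chip/homer
- Install HOMER from within that folder:
- perl configureHomer.pl -list   # list all available packages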
III. Analysis steps
1. Data source
Data used: Wang J, Zou JX, Xue X, Cai D et al. ROR-γ drives androgen receptor expression and represents a therapeutic target in castration-resistant prostate cancer. Nat Med 2016 May;22(5):488-96. PMID: 27019329
| Sample | Description | Run | Link |
| --- | --- | --- | --- |
| GSM1868876 | C4-2B vehicle Input chip seq; Homo sapiens; ChIP-Seq | SRR2242690 | https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR2242690 |
| GSM1868866 | C4-2B vehicle AR chip seq; Homo sapiens; ChIP-Seq | SRR2242680 | https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR2242680 |
| GSM1868862 | 2nd time C4-2B vehicle AR chip seq; Homo sapiens; ChIP-Seq | SRR2242676 | https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR2242676 |
| GSM1868864 | 2nd time C4-2B vehicle Input chip seq; Homo sapiens; ChIP-Seq | SRR2242678 | https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR2242678 |
2. Download the data and rename
wget https://sra-pub-run-odp.s3.amazonaws.com/sra/SRR8557353/SRR8557353
mv SRR8557353 input.h3k.con
wget https://sra-downloadb.be-md.ncbi.nlm.nih.gov/sos3/sra-pub-run-20/SRR8557351/SRR8557351.1
mv SRR8557351.1 ip.h3k.con
wget https://sra-downloadb.be-md.ncbi.nlm.nih.gov/sos3/sra-pub-run-21/SRR8557354/SRR8557354.1
mv SRR8557354.1 input.h3k.xy
wget https://sra-pub-run-odp.s3.amazonaws.com/sra/SRR8557352/SRR8557352
mv SRR8557352 ip.h3k.xy
3. Convert to FASTQ (--sra-id is given the renamed file instead of the SRR accession, which speeds up the run; it is very fast)
parallel-fastq-dump --sra-id input.h3k.con --threads 35 --outdir out/ --split-files --gzip
parallel-fastq-dump --sra-id ip.h3k.con --threads 35 --outdir out/ --split-files --gzip
parallel-fastq-dump --sra-id input.h3k.xy --threads 35 --outdir out/ --split-files --gzip
parallel-fastq-dump --sra-id ip.h3k.xy --threads 35 --outdir out/ --split-files --gzip
4. Quality control
4.1 trim_galore
- trim_galore (single-threaded only)
- Cutadapt: removes adapters, trims low-quality bases from the 3' end, and discards reads that are too short
- Output: report.txt
- Output: fq.gz (used for alignment)
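These notes do not record the trim_galore command itself. The following is a minimal sketch, assuming the FASTQ files sit in out/ and carry the names that parallel-fastq-dump produces with --split-files --gzip (<sample>_1.fastq.gz). The helper name trim_cmd is made up for illustration, and the exact options used in the original analysis are not known; --gzip and -o are standard trim_galore options. The loop only prints the commands, so they can be reviewed first and then piped to bash.

```shell
# Sketch only: print one trim_galore call per sample.
# Assumption: inputs are named <dir>/<sample>_1.fastq.gz (single-end data).

trim_cmd() {
    # $1 = sample name, $2 = directory holding the FASTQ files
    printf 'trim_galore --gzip -o %s %s/%s_1.fastq.gz\n' "$2" "$2" "$1"
}

for s in input.h3k.con ip.h3k.con input.h3k.xy ip.h3k.xy; do
    trim_cmd "$s" out   # prints the command; pipe the loop output to bash to run it
done
```

With these inputs trim_galore should write <sample>_1_trimmed.fq.gz next to a trimming report, which matches the file names used in the alignment step.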
4.2 FastQC quality report (older data)
- fastqc -o /Users/xusiqi/CHIP/out -t 6 /Users/xusiqi/CHIP/out/SRR2242680_1_trimmed.fq.gz
- fastqc -o /Users/xusiqi/CHIP/out -t 6 /Users/xusiqi/CHIP/out/SRR2242690_1_trimmed.fq.gz
- Usage: fastqc -o <absolute output path> -t <number of threads> <absolute path of the input file>
- -o is followed by the absolute output path, -t 6 sets the number of threads, and the last argument is the absolute path of the input file; this takes about 5 min per file.
5. bowtie2 alignment (15 min/file)
bowtie2 -p 35 -x /Users/xusiqi/CHIP/index/hg19/hg19 -U /Users/xusiqi/CHIP/lesson23/out/input.h3k.con_1_trimmed.fq.gz -S input.h3k.con.sam
bowtie2 -p 35 -x /Users/xusiqi/CHIP/index/hg19/hg19 -U /Users/xusiqi/CHIP/lesson23/out/input.h3k.xy_1_trimmed.fq.gz -S input.h3k.xy.sam
bowtie2 -p 35 -x /Users/xusiqi/CHIP/index/hg19/hg19 -U /Users/xusiqi/CHIP/lesson23/out/ip.h3k.con_1_trimmed.fq.gz -S ip.h3k.con.sam
bowtie2 -p 35 -x /Users/xusiqi/CHIP/index/hg19/hg19 -U /Users/xusiqi/CHIP/lesson23/out/ip.h3k.xy_1_trimmed.fq.gz -S ip.h3k.xy.sam
6. samtools: convert SAM to BAM
6.1 view & sort
samtools view -S -b input.h3k.con.sam > input.h3k.con.bam
samtools sort input.h3k.con.bam -o input.h3k.con.sorted.bam
samtools view -S -b input.h3k.xy.sam > input.h3k.xy.bam
samtools sort input.h3k.xy.bam -o input.h3k.xy.sorted.bam
samtools view -S -b ip.h3k.con.sam > ip.h3k.con.bam
samtools sort ip.h3k.con.bam -o ip.h3k.con.sorted.bam
samtools view -S -b ip.h3k.xy.sam > ip.h3k.xy.bam
samtools sort ip.h3k.xy.bam -o ip.h3k.xy.sorted.bam
6.2 index
samtools index input.h3k.con.sorted.bam
samtools index input.h3k.xy.sorted.bam
samtools index ip.h3k.con.sorted.bam
samtools index ip.h3k.xy.sorted.bam
7. bamCoverage: convert BAM to bw (bigWig)
bamCoverage -e 170 -bs 10 -p 35 -b input.h3k.con.sorted.bam -o input.h3k.con.sorted.bw
bamCoverage -e 170 -bs 10 -p 35 -b input.h3k.xy.sorted.bam -o input.h3k.xy.sorted.bw
bamCoverage -e 170 -bs 10 -p 35 -b ip.h3k.con.sorted.bam -o ip.h3k.con.sorted.bw
bamCoverage -e 170 -bs 10 -p 35 -b ip.h3k.xy.sorted.bam -o ip.h3k.xy.sorted.bw
8. View peaks in the bw files with IGV
- Download IGV: Downloads | Integrative Genomics Viewer
- Open the bw files in IGV
9. MACS2 callpeak
- Template: macs2 callpeak -t <treatment sorted.bam> -c <input control sorted.bam> -g hs -B -f BAM -n <output name (no extension, keep it simple)> -q 0.05
macs2 callpeak -t ip.h3k.con.sorted.bam -c input.h3k.con.sorted.bam -g hs -B -f BAM -n con -q 0.05
macs2 callpeak -t ip.h3k.xy.sorted.bam -c input.h3k.xy.sorted.bam -g hs -B -f BAM -n xy -q 0.05
10. MACS2 differential analysis (2021-10-15, to be completed)
Reference tutorial:
Call differential binding events · macs3-project/MACS Wiki · GitHub
Required files and values:
macs2 predictd -i input.h3k.con.sorted.bam
macs2 predictd -i input.h3k.xy.sorted.bam
macs2 predictd -i ip.h3k.con.sorted.bam
macs2 predictd -i ip.h3k.xy.sorted.bam
| d-length | tags after filtering |
| --- | --- |
| 199, 205 | 13313222, 24184450, 22342330, 21410271 |
| 202 | already listed in the callpeak Excel (xls) output |
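Since this section is still marked as unfinished, here is a hedged sketch of how the remaining steps could look if they follow the MACS wiki tutorial named above: rerun callpeak per condition with -B, --nomodel and a fixed --extsize, then compare the resulting bedGraph files with macs2 bdgdiff. The helper name bdgdiff_cmds, the output prefix diff_con_vs_xy, and the numbers in the example call are placeholders, not values from this analysis. In the tutorial, --d1 and --d2 are the smaller of the treatment and control "tags after filtering" for each condition, and -g 60 with -l equal to the extension size mirrors the tutorial's example. The function only prints the commands.

```shell
# Sketch only: print the macs2 commands for a con-vs-xy comparison.
# File names follow the callpeak step (-n con, -n xy); everything else is assumed.

bdgdiff_cmds() {
    # $1 = extension size in bp, $2 = depth for con (--d1), $3 = depth for xy (--d2)
    ext=$1; d1=$2; d2=$3
    for cond in con xy; do
        printf 'macs2 callpeak -B -t ip.h3k.%s.sorted.bam -c input.h3k.%s.sorted.bam -g hs -f BAM -n %s --nomodel --extsize %s\n' \
            "$cond" "$cond" "$cond" "$ext"
    done
    printf 'macs2 bdgdiff --t1 con_treat_pileup.bdg --c1 con_control_lambda.bdg --t2 xy_treat_pileup.bdg --c2 xy_control_lambda.bdg --d1 %s --d2 %s -g 60 -l %s --o-prefix diff_con_vs_xy\n' \
        "$d1" "$d2" "$ext"
}

# Placeholder values; substitute the d-length and tag counts from your own runs.
bdgdiff_cmds 200 1000000 1000000
```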
11. Motif analysis with HOMER
WeChat public-account reference: "HOMER | chipseq数据进行peaks的差异分析" (HOMER | differential peak analysis of ChIP-seq data)
11.1 List all available packages
perl configureHomer.pl -list
11.2 Download annotation data
perl configureHomer.pl -install hg19   # installs the hg19 genome package; the human data (1.38 GB) is downloaded automatically; retry a few times if needed, speeds can reach 6-7 Mb
11.3 Create a tag directory
- Create an out folder under homer and run the following commands from that path
makeTagDirectory <tag directory name> -genome <hg19, with absolute path> -checkGC <input SAM file, absolute path>
- Input files: input.h3k.con.sam, input.h3k.xy.sam, ip.h3k.con.sam, ip.h3k.xy.sam
makeTagDirectory input.h3k.con -genome /Users/xusiqi/chip/homer/data/genomes/hg19 -checkGC /Users/xusiqi/chip/lesson23/out/input.h3k.con.sam
makeTagDirectory input.h3k.xy -genome /Users/xusiqi/chip/homer/data/genomes/hg19 -checkGC /Users/xusiqi/chip/lesson23/out/input.h3k.xy.sam
makeTagDirectory ip.h3k.con -genome /Users/xusiqi/chip/homer/data/genomes/hg19 -checkGC /Users/xusiqi/chip/lesson23/out/ip.h3k.con.sam
makeTagDirectory ip.h3k.xy -genome /Users/xusiqi/chip/homer/data/genomes/hg19 -checkGC /Users/xusiqi/chip/lesson23/out/ip.h3k.xy.sam
11.4 Differentially Bound Peaks (finding differential peaks)
getDifferentialPeaks <peak file> <target tag directory> <background tag directory> [options]
Input files:
Peak files: con_peaks.narrowPeak, xy_peaks.narrowPeak
Tag directories: input.h3k.con, input.h3k.xy, ip.h3k.con, ip.h3k.xy
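The usage line can be filled in with the files listed here. The sketch below is one plausible pairing, not necessarily the one used for these notes: it assumes each condition's peaks are tested with that condition's IP tag directory as target and the other condition's IP tag directory as background. The helper name diffpeak_cmd is made up for illustration. -F 4 and -P 0.0001 are HOMER's documented defaults for fold change and p-value, written out so they are easy to adjust. The output names match the files used in the annotation step. The function only prints the commands.

```shell
# Sketch only: print getDifferentialPeaks calls for both directions.
# Assumption: target = IP of the same condition, background = IP of the other one.

diffpeak_cmd() {
    # $1 = peak file, $2 = target tag directory,
    # $3 = background tag directory, $4 = output file
    printf 'getDifferentialPeaks %s %s %s -F 4 -P 0.0001 > %s\n' "$1" "$2" "$3" "$4"
}

diffpeak_cmd con_peaks.narrowPeak ip.h3k.con ip.h3k.xy con.diffpeaks.csv
diffpeak_cmd xy_peaks.narrowPeak ip.h3k.xy ip.h3k.con xy.diffpeaks.csv
```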
11.5 Annotate peaks (peak annotation)
Usage: annotatePeaks.pl <peak file | tss> <genome version> [additional options...]
Input files: con.diffpeaks.csv, xy.diffpeaks.csv
annotatePeaks.pl con.diffpeaks.csv /Users/xusiqi/chip/homer/data/genomes/hg19 > con.diffpeaks.anno.txt
annotatePeaks.pl xy.diffpeaks.csv /Users/xusiqi/chip/homer/data/genomes/hg19 > xy.diffpeaks.anno.txt
Merge the differential peaks with the peak annotations in R, using the fread and merge functions.
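The notes do this merge in R with fread and merge. As a swapped-in illustration that stays in the shell, the sketch below performs an equivalent inner join with awk. It assumes both files are tab-delimited with the peak ID in the first column and that header or comment lines in the differential-peak file start with "#". The function name merge_by_id and the output file con.diffpeaks.merged.txt are hypothetical. Header rows are not carried over.

```shell
# Sketch only: inner join of two tab-delimited tables on column 1.

merge_by_id() {
    # $1 = differential-peak table, $2 = annotation table.
    # Prints each annotation row whose ID occurs in $1, followed by
    # the remaining columns of the matching row from $1.
    awk -F '\t' -v OFS='\t' '
        FNR == NR {
            if ($0 ~ /^#/) next                 # skip comment/header lines
            i = index($0, "\t")
            rest[$1] = (i ? substr($0, i + 1) : "")
            next
        }
        ($1 in rest) { print $0, rest[$1] }
    ' "$1" "$2"
}

# Runs only when the files from the previous steps are present.
if [ -f con.diffpeaks.csv ] && [ -f con.diffpeaks.anno.txt ]; then
    merge_by_id con.diffpeaks.csv con.diffpeaks.anno.txt > con.diffpeaks.merged.txt
fi
```

Peaks that appear in only one of the two files are dropped, which is also what R's merge does by default.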
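11.6 Other features
- annotatePeaks.pl can also build histograms of read enrichment relative to a given genomic feature, such as transcription start sites (TSS) or any other region the user defines. Because TSSs are used so often for this purpose, HOMER ships a built-in TSS annotation (based on RefSeq transcripts). The key histogram parameters are "-hist #" and "-size #", which control the bin size and the total length of the histogram. Another important option is "-d", which specifies the experiments for which histograms are compiled.
- annotatePeaks.pl tss hg19 -size 8000 -hist 10 -d h3k.con/ h3k.bay/ > output.txt
- Open output.txt in a spreadsheet program such as Excel. The first column gives the distance offset from the TSS, followed by "Coverage", "+ Tags", and "- Tags" columns for each experiment. Plot an X-Y line chart with the first column on the x-axis to see the pattern.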
12. ROSE: finding super enhancers (SE)
12.1 Reference tutorial: ROSE | 超级增强子鉴定 (super-enhancer identification)
12.2 ROSE installation: see above
12.3 Finding SEs with ROSE
python ROSE_main.py -g HG19 -i xy_summits.bed -r ip.h3k.xy.sorted.bam -c input.h3k.xy.sorted.bam -o /Users/xusiqi/chip/rose/xy
python ROSE_main.py -g HG19 -i con_summits.bed -r ip.h3k.con.sorted.bam -c input.h3k.con.sorted.bam -o /Users/xusiqi/chip/rose/con
(To be continued)