Literature reading | Discovering SNVs and indels from ATAC-seq data
Discovering single nucleotide variants and indels from bulk and single-cell ATAC-seq
Paper link: https://www.biorxiv.org/content/10.1101/2021.02.26.433126v1.full
Background
1. Discovery of genetic variants
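| | Whole-genome sequencing (WGS) | ATAC-seq |
| --- | --- | --- |
| Characteristics | High cost; most reads come from non-regulatory, non-coding regions | Low cost; reads come from regulatory regions (where variants are likely to be functional) |
| Use | The standard approach to genetic variant discovery | Not yet systematically evaluated |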
Earlier work evaluated how well single-nucleotide variant callers perform on scRNA-seq data (reference: https://pubmed.ncbi.nlm.nih.gov/31744515/ ), but no study had systematically evaluated such callers on ATAC-seq data.
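(1) Evaluate how well 7 variant callers predict SNVs and indels from bulk ATAC-seq and scATAC-seq data.
(2) Combine the output of these variant callers into VarCA, a prediction tool whose overall performance is significantly better than that of any single caller.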
Main results
1. Performance of the variant callers (top performers)
- SNV discovery: GATK, VarScan2, VarDict
- Indel discovery: GATK, VarScan2, Manta, VarDict
2. Performance of VarCA
| SNVs | precision: 0.99, recall: 0.95 | precision: 0.98, recall: 0.94 |
| Indels | precision: 0.93, recall: 0.80 | precision: 0.82, recall: 0.82 |
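Precision and recall here are measured against a truth set of known variants. The following is a minimal sketch of that computation, assuming each variant is represented as a `(chrom, pos, ref, alt)` tuple; the function name and the toy variants are illustrative and do not come from the paper.

```python
def precision_recall(called, truth):
    """Return (precision, recall) of a call set against a truth set."""
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # called and present in the truth set
    fp = len(called - truth)   # called but absent from the truth set
    fn = len(truth - called)   # present in the truth set but missed
    precision = tp / (tp + fp) if called else 0.0
    recall = tp / (tp + fn) if truth else 0.0
    return precision, recall


# Toy data, made up for illustration only
truth = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"), ("chr2", 40, "G", "GA")}
called = {("chr1", 100, "A", "G"), ("chr2", 40, "G", "GA"), ("chr3", 7, "T", "C")}
p, r = precision_recall(called, truth)
print(f"precision={p:.2f} recall={r:.2f}")
```

High precision means few false calls; high recall means few missed variants. The indel rows in the table show that recall is the harder of the two to keep high.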
- VarCA achieves substantially better performance than any individual method and its recalibrated quality scores can be used to filter for high confidence variants.
- Application of VarCA to single-cell ATAC-seq datasets could potentially reveal the presence of somatic mutations that are present in only some subsets of cells.
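Filtering on a quality score can be sketched in a few lines of plain Python. This assumes the recalibrated score is stored in the standard QUAL column (column 6) of the output VCF; the threshold of 30 and the toy records are arbitrary choices for illustration, not values recommended by the paper.

```python
def filter_by_qual(vcf_lines, min_qual):
    """Yield header lines plus records whose QUAL is at least min_qual."""
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):       # keep meta-information and header lines
            yield line
            continue
        qual = line.split("\t")[5]     # VCF columns: CHROM POS ID REF ALT QUAL ...
        if qual != "." and float(qual) >= min_qual:
            yield line                 # records with missing QUAL (".") are dropped


# Toy VCF content, made up for illustration only
vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t45.0\t.\t.",
    "chr1\t250\t.\tC\tT\t8.5\t.\t.",
    "chr2\t40\t.\tG\tGA\t.\t.\t.",
]
kept = list(filter_by_qual(vcf, min_qual=30))
print("\n".join(kept))
```

Raising the threshold trades recall for precision, so a sensible cutoff depends on how many false positives the downstream analysis can tolerate.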
Limitations of VarCA:
(1) It works only with paired-end sequencing datasets.
(2) It identifies only two variant types: SNVs and indels.
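To make the second limitation concrete, the sketch below shows one way to tell SNVs and indels apart from the REF and ALT alleles of a VCF record. The function name and the "other" bucket are hypothetical; this is not how VarCA itself classifies variants, and it makes no attempt to separate long indels from structural variants.

```python
def variant_type(ref, alt):
    """Classify a single REF/ALT pair as 'SNV', 'indel', or 'other'."""
    # Symbolic alleles (<DEL>), breakends ([ or ]) and the '*' allele are not handled
    if alt.startswith("<") or alt == "*" or "[" in alt or "]" in alt:
        return "other"
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(ref) != len(alt):
        return "indel"    # insertion if ALT is longer, deletion if REF is longer
    return "other"        # e.g. multi-nucleotide substitutions


for ref, alt in [("A", "G"), ("G", "GA"), ("AT", "A"), ("A", "<DEL>")]:
    print(ref, alt, variant_type(ref, alt))
```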
Workflow
1. Processing steps for bulk ATAC-seq and single-cell ATAC-seq data
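| Step | Bulk ATAC-seq | Single-cell ATAC-seq |
| --- | --- | --- |
| Step 1 | BWA-MEM: align paired-end reads to the reference genome | 10x Single Cell ATAC pipeline: alignment and clustering |
| Step 2 | samtools: filter reads | samtools, pysam: filter reads |
| Step 3 | MACS2: call peaks | MACS2: call peaks |
| Step 4 | VarCA: variant detection | VarCA: variant detection |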
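Steps 1 to 3 of the bulk column can be sketched as a list of command lines. The file names, the MAPQ cutoff of 20, and the "properly paired" filter (`-f 2`) are assumptions chosen for illustration; the notes above do not say which read filters the authors applied. The function only builds the commands, so it can be inspected without the tools installed.

```python
import shlex


def bulk_atac_commands(ref_fa, fq1, fq2, sample):
    """Build (but do not run) command lines for alignment, filtering and peak calling."""
    sam = f"{sample}.sam"
    filtered = f"{sample}.filtered.bam"
    sorted_bam = f"{sample}.sorted.bam"
    return [
        # Step 1: align paired-end reads (ref_fa must already be indexed with `bwa index`)
        ["bwa", "mem", "-o", sam, ref_fa, fq1, fq2],
        # Step 2: keep properly paired reads with MAPQ >= 20 (assumed filter), then sort
        ["samtools", "view", "-b", "-q", "20", "-f", "2", "-o", filtered, sam],
        ["samtools", "sort", "-o", sorted_bam, filtered],
        # Step 3: call peaks in paired-end mode
        ["macs2", "callpeak", "-t", sorted_bam, "-f", "BAMPE", "-n", sample],
    ]


cmds = bulk_atac_commands("hg38.fa", "s_R1.fastq.gz", "s_R2.fastq.gz", "s")
for cmd in cmds:
    print(shlex.join(cmd))
```

Each command could be executed with `subprocess.run(cmd, check=True)`. The sorted BAM and the peak file are then the inputs to step 4, which is configured as described in the VarCA repository rather than invoked as a single command.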
2. Overview of VarCA
https://github.com/aryarm/varCA
- prepare subworkflow
  - runs multiple variant callers on aligned ATAC-seq reads
  - gathers the output of these callers together into a single dataset in variant call format (VCF)
- classify subworkflow
  - uses the output from the prepare subworkflow to predict variants within ATAC-seq peaks
  - outputs a new VCF file containing the predictions
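The idea behind the two subworkflows can be illustrated with a toy example: first merge the calls of several callers into one row per site, then decide which sites to keep. This is a simplified sketch, not VarCA's code. The data layout and function names are hypothetical, and a simple vote count stands in for the trained classifier that the classify subworkflow uses.

```python
def merge_calls(calls_by_caller):
    """Combine per-caller calls into one row per site.

    calls_by_caller maps caller name -> {(chrom, pos, ref, alt): quality score}.
    Returns a sorted list of (site, features), where features[caller] is that
    caller's score, or None if the caller did not report the site.
    """
    callers = sorted(calls_by_caller)
    sites = set()
    for calls in calls_by_caller.values():
        sites.update(calls)
    return [
        (site, {c: calls_by_caller[c].get(site) for c in callers})
        for site in sorted(sites)
    ]


def n_supporting(features):
    """Count the callers that reported the site."""
    return sum(score is not None for score in features.values())


# Toy calls, made up for illustration only
calls = {
    "gatk": {("chr1", 100, "A", "G"): 50.0, ("chr1", 250, "C", "T"): 12.0},
    "varscan": {("chr1", 100, "A", "G"): 30.0},
    "vardict": {("chr1", 100, "A", "G"): 41.0, ("chr2", 40, "G", "GA"): 22.0},
}
rows = merge_calls(calls)
# Stand-in for the classifier: keep sites reported by at least two callers
consensus = [site for site, features in rows if n_supporting(features) >= 2]
print(consensus)
```

A learned model can do better than a fixed vote because it can weight each caller's score differently, which is consistent with the reported gain of VarCA over any individual caller.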