GATK: VariantAnnotator
VariantAnnotator
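Overview
Purpose: annotate variant calls using context information.
Category: variant manipulation tools
Summary: annotates variant calls based on their context (as opposed to functional annotation). Many annotation modules are currently available (see the Annotation modules section).
Input files
A VCF file to annotate and, optionally, a BAM file.
Output files
The annotated VCF file.
Usage example
Add per-sample depth and dbSNP ID information to the output of HaplotypeCaller or UnifiedGenotyper.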
java -jar GenomeAnalysisTK.jar \
    -R reference.fasta \
    -T VariantAnnotator \
    -I input.bam \
    -V input.vcf \
    -o output.vcf \
    -A Coverage \
    --dbsnp dbsnp.vcf
Parameters:
-R/--reference_sequence: the reference genome
-T/--analysis_type: the tool to run
-I/--input_file: the BAM file corresponding to the VCF
-o: the output file
-V/--variant: the input VCF file
-A/--annotation: which annotations to add
--dbsnp: a database of known SNPs used for annotation
Note: HaplotypeCaller and MuTect2 also accept the -A option, and some annotation modules can only be computed by HaplotypeCaller and MuTect2, for example StrandAlleleCountsBySample.
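To request more than one annotation module, the command above can be extended with additional -A options. The sketch below assumes that -A may be repeated once per module, as is usual for GATK 3.x command lines. It only prints the resulting command (a dry run) rather than executing it. The function name `build_annotator_cmd` is hypothetical, and the file names are the placeholders from the example above.

```shell
#!/bin/sh
# Dry run: build a VariantAnnotator command line with one -A per module.
# build_annotator_cmd is a hypothetical helper; file names are placeholders.
build_annotator_cmd() {
  cmd="java -jar GenomeAnalysisTK.jar -R reference.fasta -T VariantAnnotator -I input.bam -V input.vcf -o output.vcf --dbsnp dbsnp.vcf"
  # Append one "-A <module>" pair for every argument passed in.
  for module in "$@"; do
    cmd="$cmd -A $module"
  done
  printf '%s\n' "$cmd"
}

# Example: print a command requesting three modules from the table below.
build_annotator_cmd Coverage FisherStrand QualByDepth
```

Printing the command first makes it easy to check the module list before starting a long-running job.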
The values that -A accepts are listed below.
Annotation modules
These are the annotation modules listed in the official documentation:
| AS_BaseQualityRankSumTest | Allele-specific rank Sum Test of REF versus ALT base quality scores |
| AS_FisherStrand | Allele-specific strand bias estimated using Fisher's Exact Test |
| AS_InbreedingCoeff | Allele-specific likelihood-based test for the inbreeding among samples |
| AS_InsertSizeRankSum | Allele-specific Rank Sum Test for insert sizes of REF versus ALT reads |
| AS_MQMateRankSumTest | Allele-specific Rank Sum Test for mate's mapping qualities of REF versus ALT reads |
| AS_MappingQualityRankSumTest | Allele-specific Rank Sum Test for mapping qualities of REF versus ALT reads |
| AS_QualByDepth | Allele-specific call confidence normalized by depth of sample reads supporting the allele |
| AS_RMSMappingQuality | Allele-specific Root Mean Square of the mapping quality of reads across all samples. |
| AS_ReadPosRankSumTest | Allele-specific Rank Sum Test for relative positioning of REF versus ALT allele within reads |
| AS_StrandOddsRatio | Allele-specific strand bias estimated by the Symmetric Odds Ratio test |
| AlleleBalance | Allele balance across all samples |
| AlleleBalanceBySample | Allele balance per sample |
| AlleleCountBySample | Allele count and frequency expectation per sample |
| BaseCounts | Count of A, C, G, T bases across all samples |
| BaseCountsBySample | Count of A, C, G, T bases for each sample |
| BaseQualityRankSumTest | Rank Sum Test of REF versus ALT base quality scores |
| BaseQualitySumPerAlleleBySample | Sum of evidence in reads supporting each allele for each sample |
| ChromosomeCounts | Counts and frequency of alleles in called genotypes |
| ClippingRankSumTest | Rank Sum Test for hard-clipped bases on REF versus ALT reads |
| ClusteredReadPosition | Detect clustering of variants near the ends of reads |
| Coverage | Total depth of coverage per sample and over all samples. |
| DepthPerAlleleBySample | Depth of coverage of each allele per sample |
| DepthPerSampleHC | Depth of informative coverage for each sample. |
| ExcessHet | Phred-scaled p-value for exact test of excess heterozygosity |
| FisherStrand | Strand bias estimated using Fisher's Exact Test |
| FractionInformativeReads | The fraction of reads deemed informative over the entire cohort |
| GCContent | GC content of the reference around the given site |
| GenotypeSummaries | Summarize genotype statistics from all samples at the site level |
| HaplotypeScore | Consistency of the site with strictly two segregating haplotypes |
| HardyWeinberg | Hardy-Weinberg test for transmission disequilibrium |
| HomopolymerRun | Largest contiguous homopolymer run of the variant allele |
| InbreedingCoeff | Likelihood-based test for the inbreeding among samples |
| LikelihoodRankSumTest | Rank Sum Test of per-read likelihoods of REF versus ALT reads |
| LowMQ | Proportion of low quality reads |
| MVLikelihoodRatio | Likelihood of being a Mendelian Violation |
| MappingQualityRankSumTest | Rank Sum Test for mapping qualities of REF versus ALT reads |
| MappingQualityZero | Count of all reads with MAPQ = 0 across all samples |
| MappingQualityZeroBySample | Count of reads with mapping quality zero for each sample |
| NBaseCount | Percentage of N bases |
| OxoGReadCounts | Count of read pairs in the F1R2 and F2R1 configurations supporting the reference and alternate alleles |
| PossibleDeNovo | Existence of a de novo mutation in at least one of the given families |
| QualByDepth | Variant call confidence normalized by depth of sample reads supporting a variant |
| RMSMappingQuality | Root Mean Square of the mapping quality of reads across all samples. |
| ReadPosRankSumTest | Rank Sum Test for relative positioning of REF versus ALT alleles within reads |
| SampleList | List samples that are non-reference at a given site |
| SnpEff | Top effect from SnpEff functional predictions |
| SpanningDeletions | Fraction of reads containing spanning deletions |
| StrandAlleleCountsBySample | Number of forward and reverse reads that support each allele |
| StrandBiasBySample | Number of forward and reverse reads that support REF and ALT alleles |
| StrandOddsRatio | Strand bias estimated by the Symmetric Odds Ratio test |
| TandemRepeatAnnotator | Tandem repeat unit composition and counts per allele |
| TransmissionDisequilibriumTest | Wittkowski transmission disequilibrium test |
| VariantType | General category of variant |
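After annotation, one way to check which annotations the output file declares is to read its header: the VCF format defines each INFO and FORMAT field in a `##INFO=<ID=...>` or `##FORMAT=<ID=...>` line. The following is a minimal sketch of that check. The function name `list_vcf_annotations` is hypothetical, and `output.vcf` is the placeholder output name from the usage example.

```shell
#!/bin/sh
# List the INFO and FORMAT field IDs declared in a VCF header.
# list_vcf_annotations is a hypothetical helper name.
list_vcf_annotations() {
  # Keep only the INFO/FORMAT header definitions, then print "<kind> <ID>".
  grep -E '^##(INFO|FORMAT)=<ID=' "$1" |
    sed -E 's/^##(INFO|FORMAT)=<ID=([^,>]+).*/\1 \2/'
}

# Example (assumes output.vcf exists):
# list_vcf_annotations output.vcf
```

For the usage example with -A Coverage, one would expect a depth field such as DP to appear in this listing, while the dbSNP identifiers go into the ID column of the variant records rather than into the header.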