Package picard.analysis
Class CollectWgsMetrics
- java.lang.Object
-
- picard.cmdline.CommandLineProgram
-
- picard.analysis.CollectWgsMetrics
-
- Direct Known Subclasses:
CollectRawWgsMetrics,CollectWgsMetricsWithNonZeroCoverage
@DocumentedFeature public class CollectWgsMetrics extends CommandLineProgram
Computes a number of metrics that are useful for evaluating coverage and performance of whole genome sequencing experiments. Two algorithms are available for computing these metrics: default and fast. The fast algorithm is enabled by the USE_FAST_ALGORITHM option. It performs better on regions of a BAM file with coverage of at least 10 reads per locus; at lower coverage the two algorithms perform the same.
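As a rough illustration of how these options might be supplied, the sketch below assembles an argument list using field names documented on this page (INPUT, OUTPUT, REFERENCE_SEQUENCE, USE_FAST_ALGORITHM). The helper class `WgsArgsSketch`, the file names, and the `NAME=value` syntax are assumptions; depending on the Picard version and parser in use, a `-NAME value` form may be expected instead. A reference is included on the assumption that this tool needs one, since it overrides requiresReference().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Picard: builds a NAME=value argument list.
final class WgsArgsSketch {

    // Assumes the legacy "NAME=value" syntax; other parsers may expect "-NAME value".
    static String[] buildArgs(String input, String output, String reference, boolean useFast) {
        List<String> args = new ArrayList<>();
        args.add("INPUT=" + input);
        args.add("OUTPUT=" + output);
        args.add("REFERENCE_SEQUENCE=" + reference);
        // The class description suggests the fast algorithm for coverage of 10+ reads per locus.
        args.add("USE_FAST_ALGORITHM=" + useFast);
        return args.toArray(new String[0]);
    }

    public static void main(String[] unused) {
        // Illustrative file names only.
        for (String arg : buildArgs("sample.bam", "wgs_metrics.txt", "reference.fasta", true)) {
            System.out.println(arg);
        }
    }
}
```

With Picard on the classpath, an array like this could presumably be handed to the inherited instanceMain method of a CollectWgsMetrics instance.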
-
-
Nested Class Summary
Nested Classes (Modifier and Type, Class):
- static class CollectWgsMetrics.CollectWgsMetricsIntervalArgumentCollection
- protected static class CollectWgsMetrics.WgsMetricsCollector
-
Field Summary
Fields (Modifier and Type, Field):
- List<Double> ALLELE_FRACTION
- boolean COUNT_UNPAIRED
- int COVERAGE_CAP
- boolean INCLUDE_BQ_HISTOGRAM
- File INPUT
- protected IntervalArgumentCollection intervalArgumentCollection
- protected File INTERVALS
- int LOCUS_ACCUMULATION_CAP
- int MINIMUM_BASE_QUALITY
- int MINIMUM_MAPPING_QUALITY
- File OUTPUT
- int READ_LENGTH
- int SAMPLE_SIZE
- long STOP_AFTER
- File THEORETICAL_SENSITIVITY_OUTPUT
- boolean USE_FAST_ALGORITHM
-
Fields inherited from class picard.cmdline.CommandLineProgram
COMPRESSION_LEVEL, CREATE_INDEX, CREATE_MD5_FILE, GA4GH_CLIENT_SECRETS, MAX_ALLOWABLE_ONE_LINE_SUMMARY_LENGTH, MAX_RECORDS_IN_RAM, QUIET, REFERENCE_SEQUENCE, referenceSequence, specialArgumentsCollection, TMP_DIR, USE_JDK_DEFLATER, USE_JDK_INFLATER, VALIDATION_STRINGENCY, VERBOSITY
-
-
Constructor Summary
Constructors Constructor Description CollectWgsMetrics()
-
Method Summary
All Methods, Instance Methods, Concrete Methods (Modifier and Type, Method, Description):
- protected int doWork()
Do the work after command line has been parsed.
- protected WgsMetrics generateWgsMetrics(htsjdk.samtools.util.IntervalList intervals, htsjdk.samtools.util.Histogram<Integer> highQualityDepthHistogram, htsjdk.samtools.util.Histogram<Integer> unfilteredDepthHistogram, double pctExcludedByAdapter, double pctExcludedByMapq, double pctExcludedByDupes, double pctExcludedByPairing, double pctExcludedByBaseq, double pctExcludedByOverlap, double pctExcludedByCapping, double pctTotal, int coverageCap, htsjdk.samtools.util.Histogram<Integer> unfilteredBaseQHistogram, int theoreticalHetSensitivitySampleSize)
- protected long getBasesExcludedBy(CountingFilter filter)
If INTERVALS is specified, this will count bases beyond the interval list when the read overlaps the intervals and extends beyond the edge.
- protected AbstractWgsMetricsCollector getCollector(int coverageCap, htsjdk.samtools.util.IntervalList intervals)
Creates an AbstractWgsMetricsCollector implementation according to the USE_FAST_ALGORITHM value.
- protected htsjdk.samtools.util.IntervalList getIntervalsToExamine()
Gets the intervals over which we will calculate metrics.
- protected htsjdk.samtools.util.AbstractLocusIterator getLocusIterator(htsjdk.samtools.SamReader in)
Creates an AbstractLocusIterator implementation according to the USE_FAST_ALGORITHM value.
- protected htsjdk.samtools.SAMFileHeader getSamFileHeader()
This method should only be called after getSamReader() is called.
- protected htsjdk.samtools.SamReader getSamReader()
Gets the SamReader from which records will be examined.
- protected IntervalArgumentCollection makeIntervalArgumentCollection()
- protected boolean requiresReference()
-
Methods inherited from class picard.cmdline.CommandLineProgram
checkRInstallation, customCommandLineValidation, getCommandLine, getCommandLineParser, getCommandLineParserForArgs, getDefaultHeaders, getFaqLink, getMetricsFile, getPGRecord, getStandardUsagePreamble, getStandardUsagePreamble, getVersion, hasWebDocumentation, instanceMain, instanceMainWithExit, makeReferenceArgumentCollection, parseArgs, setDefaultHeaders, useLegacyParser
-
-
-
-
Field Detail
-
INPUT
@Argument(shortName="I", doc="Input SAM/BAM/CRAM file.") public File INPUT
-
OUTPUT
@Argument(shortName="O", doc="Output metrics file.") public File OUTPUT
-
MINIMUM_MAPPING_QUALITY
@Argument(shortName="MQ", doc="Minimum mapping quality for a read to contribute coverage.") public int MINIMUM_MAPPING_QUALITY
-
MINIMUM_BASE_QUALITY
@Argument(shortName="Q", doc="Minimum base quality for a base to contribute coverage. N bases will be treated as having a base quality of negative infinity and will therefore be excluded from coverage regardless of the value of this parameter.") public int MINIMUM_BASE_QUALITY
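The rule in this description can be sketched as a small predicate. This is a minimal illustration, not Picard's implementation: the class `BaseQualityFilterSketch` and its methods are hypothetical, and the inclusive comparison (quality equal to the minimum passes) is an assumption read from the word "minimum".

```java
// Hypothetical stand-in for the documented base-quality rule; not Picard code.
final class BaseQualityFilterSketch {

    // True if a base would contribute coverage under the documented rule.
    static boolean contributesCoverage(char base, int baseQuality, int minimumBaseQuality) {
        // N bases behave as if their quality were negative infinity,
        // so they are excluded whatever the threshold is.
        // Treating lowercase 'n' the same way is an assumption of this sketch.
        if (Character.toUpperCase(base) == 'N') {
            return false;
        }
        // Assumption: the threshold is inclusive.
        return baseQuality >= minimumBaseQuality;
    }

    // Counts the bases of one read that would contribute coverage.
    static int countContributing(String bases, int[] qualities, int minimumBaseQuality) {
        if (bases.length() != qualities.length) {
            throw new IllegalArgumentException("bases and qualities must have the same length");
        }
        int count = 0;
        for (int i = 0; i < qualities.length; i++) {
            if (contributesCoverage(bases.charAt(i), qualities[i], minimumBaseQuality)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The N base is skipped even though its quality score is the highest.
        System.out.println(countContributing("ACNT", new int[] {30, 10, 40, 20}, 20));
    }
}
```

Note that even a threshold of 0 leaves N bases out, which is the point of the "regardless of the value of this parameter" clause.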
-
COVERAGE_CAP
@Argument(shortName="CAP", doc="Treat positions with coverage exceeding this value as if they had coverage at this value (but calculate the difference for PCT_EXC_CAPPED).") public int COVERAGE_CAP
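The capping behaviour described here can be sketched as follows. The class `CoverageCapSketch` and its methods are hypothetical and only illustrate the arithmetic: depth above the cap is counted at the cap, and the surplus is tracked separately as the quantity that would feed PCT_EXC_CAPPED. How Picard normalises that surplus into a percentage is not stated on this page and is not modelled.

```java
// Hypothetical illustration of the COVERAGE_CAP arithmetic; not Picard code.
final class CoverageCapSketch {

    // Depth that is counted toward coverage at one position.
    static int cappedDepth(int depth, int coverageCap) {
        return Math.min(depth, coverageCap);
    }

    // Depth above the cap at one position (the "difference" in the description).
    static int excessAboveCap(int depth, int coverageCap) {
        return Math.max(0, depth - coverageCap);
    }

    // Returns {countedBases, excessBases} summed over all positions.
    static long[] summarize(int[] depths, int coverageCap) {
        long counted = 0;
        long excess = 0;
        for (int depth : depths) {
            counted += cappedDepth(depth, coverageCap);
            excess += excessAboveCap(depth, coverageCap);
        }
        return new long[] {counted, excess};
    }

    public static void main(String[] args) {
        // Illustrative depths and cap only.
        long[] totals = summarize(new int[] {10, 250, 300}, 250);
        System.out.println("counted=" + totals[0] + " excess=" + totals[1]);
    }
}
```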
-
LOCUS_ACCUMULATION_CAP
@Argument(doc="At positions with coverage exceeding this value, completely ignore reads that accumulate beyond this value (so that they will not be considered for PCT_EXC_CAPPED). Used to keep memory consumption in check, but could create bias if set too low") public int LOCUS_ACCUMULATION_CAP
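To make the difference from COVERAGE_CAP concrete, the sketch below splits the raw depth at one locus into three parts: depth that is counted, depth excluded by the coverage cap, and depth that is ignored entirely because it accumulated past the locus cap. The class `LocusCapSketch`, the method `split`, and the numbers are hypothetical, and the sketch simplifies by ignoring all other filters.

```java
// Hypothetical illustration of how the two caps might interact; not Picard code.
final class LocusCapSketch {

    // Returns {counted, excludedByCoverageCap, ignoredBeyondAccumulationCap}.
    static int[] split(int rawDepth, int coverageCap, int locusAccumulationCap) {
        // Reads beyond the accumulation cap are dropped before anything else sees them.
        int accumulated = Math.min(rawDepth, locusAccumulationCap);
        int ignored = rawDepth - accumulated;
        // Of the reads that were accumulated, only coverageCap are counted;
        // the rest is the part that would be considered for PCT_EXC_CAPPED.
        int counted = Math.min(accumulated, coverageCap);
        int excludedByCap = accumulated - counted;
        return new int[] {counted, excludedByCap, ignored};
    }

    public static void main(String[] args) {
        // Illustrative values only.
        int[] parts = split(150_000, 250, 100_000);
        System.out.println("counted=" + parts[0]
                + " excludedByCap=" + parts[1]
                + " ignored=" + parts[2]);
    }
}
```

The ignored portion never reaches the capping statistic, which is why a locus accumulation cap set too low could bias the results.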
-
STOP_AFTER
@Argument(doc="For debugging purposes, stop after processing this many genomic bases.") public long STOP_AFTER
-
INCLUDE_BQ_HISTOGRAM
@Argument(doc="Determines whether to include the base quality histogram in the metrics file.") public boolean INCLUDE_BQ_HISTOGRAM
-
COUNT_UNPAIRED
@Argument(doc="If true, count unpaired reads, and paired reads with one end unmapped") public boolean COUNT_UNPAIRED
-
SAMPLE_SIZE
@Argument(doc="Sample Size used for Theoretical Het Sensitivity sampling. Default is 10000.", optional=true) public int SAMPLE_SIZE
-
intervalArgumentCollection
@ArgumentCollection protected IntervalArgumentCollection intervalArgumentCollection
-
THEORETICAL_SENSITIVITY_OUTPUT
@Argument(doc="Output for Theoretical Sensitivity metrics.", optional=true) public File THEORETICAL_SENSITIVITY_OUTPUT
-
ALLELE_FRACTION
@Argument(doc="Allele fraction for which to calculate theoretical sensitivity.", optional=true) public List<Double> ALLELE_FRACTION
-
USE_FAST_ALGORITHM
@Argument(doc="If true, fast algorithm is used.") public boolean USE_FAST_ALGORITHM
-
READ_LENGTH
@Argument(doc="Average read length in the file. Default is 150.", optional=true) public int READ_LENGTH
-
INTERVALS
protected File INTERVALS
-
-
Method Detail
-
requiresReference
protected boolean requiresReference()
- Overrides:
requiresReference in class CommandLineProgram
-
makeIntervalArgumentCollection
protected IntervalArgumentCollection makeIntervalArgumentCollection()
- Returns:
- An interval argument collection to be used for this tool. Subclasses can override this to provide an argument collection with alternative arguments or argument annotations.
-
getSamReader
protected htsjdk.samtools.SamReader getSamReader()
Gets the SamReader from which records will be examined. This will also set the header so that it is available in
-
doWork
protected int doWork()
Description copied from class: CommandLineProgram
Do the work after command line has been parsed. A RuntimeException may be thrown by this method and is reported appropriately.
- Specified by:
doWork in class CommandLineProgram
- Returns:
- program exit status.
-
getIntervalsToExamine
protected htsjdk.samtools.util.IntervalList getIntervalsToExamine()
Gets the intervals over which we will calculate metrics.
-
getSamFileHeader
protected htsjdk.samtools.SAMFileHeader getSamFileHeader()
This method should only be called after getSamReader() is called.
-
generateWgsMetrics
protected WgsMetrics generateWgsMetrics(htsjdk.samtools.util.IntervalList intervals, htsjdk.samtools.util.Histogram<Integer> highQualityDepthHistogram, htsjdk.samtools.util.Histogram<Integer> unfilteredDepthHistogram, double pctExcludedByAdapter, double pctExcludedByMapq, double pctExcludedByDupes, double pctExcludedByPairing, double pctExcludedByBaseq, double pctExcludedByOverlap, double pctExcludedByCapping, double pctTotal, int coverageCap, htsjdk.samtools.util.Histogram<Integer> unfilteredBaseQHistogram, int theoreticalHetSensitivitySampleSize)
-
getBasesExcludedBy
protected long getBasesExcludedBy(CountingFilter filter)
If INTERVALS is specified, this will count bases beyond the interval list when the read overlaps the intervals and extends beyond the edge. Ideally INTERVALS should only include regions that have hard edges without reads that could extend beyond the boundary (such as a whole contig).
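The edge effect described here can be illustrated with simple coordinate arithmetic. This sketch is not Picard code: the class `IntervalOverhangSketch` and the 1-based inclusive coordinates are assumptions. It computes how many bases of a read fall outside an interval that the read overlaps, which is the portion that could inflate an excluded-base count.

```java
// Hypothetical illustration of read overhang at an interval edge; not Picard code.
final class IntervalOverhangSketch {

    // Coordinates are assumed to be 1-based and inclusive.
    // Returns the number of read bases lying outside the interval,
    // or 0 if the read does not overlap the interval at all.
    static int basesOutsideInterval(int readStart, int readEnd, int intervalStart, int intervalEnd) {
        int readLength = readEnd - readStart + 1;
        int overlapStart = Math.max(readStart, intervalStart);
        int overlapEnd = Math.min(readEnd, intervalEnd);
        int overlap = Math.max(0, overlapEnd - overlapStart + 1);
        // A read with no overlap is never examined, so it contributes nothing.
        return overlap == 0 ? 0 : readLength - overlap;
    }

    public static void main(String[] args) {
        // A 150-base read starting 5 bases before the interval and running past its end.
        System.out.println(basesOutsideInterval(95, 244, 100, 200));
        // A read fully inside a large interval (such as a whole contig) has no overhang.
        System.out.println(basesOutsideInterval(100, 200, 1, 1000));
    }
}
```

An interval list made of whole contigs gives zero overhang for every read, which matches the recommendation to use intervals with hard edges.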
-
getLocusIterator
protected htsjdk.samtools.util.AbstractLocusIterator getLocusIterator(htsjdk.samtools.SamReader in)
Creates an AbstractLocusIterator implementation according to the USE_FAST_ALGORITHM value.
- Parameters:
in - inner SamReader
- Returns:
- if USE_FAST_ALGORITHM is enabled, returns an EdgeReadIterator implementation; otherwise the default algorithm is used and a SamLocusIterator is returned.
-
getCollector
protected AbstractWgsMetricsCollector getCollector(int coverageCap, htsjdk.samtools.util.IntervalList intervals)
Creates an AbstractWgsMetricsCollector implementation according to the USE_FAST_ALGORITHM value.
- Parameters:
coverageCap - the maximum depth/coverage to consider.
intervals - the intervals over which metrics are collected.
- Returns:
- if USE_FAST_ALGORITHM is enabled, returns a FastWgsMetricsCollector implementation; otherwise the default algorithm is used and a CollectWgsMetrics.WgsMetricsCollector is returned.
-
-