Package picard.analysis.replicates
Class IndependentReplicateMetric
java.lang.Object
  htsjdk.samtools.metrics.MetricBase
    picard.analysis.MergeableMetricBase
      picard.analysis.replicates.IndependentReplicateMetric
-
public class IndependentReplicateMetric extends MergeableMetricBase
A class to store information relevant for biological rate estimation.
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class picard.analysis.MergeableMetricBase
MergeableMetricBase.MergeByAdding, MergeableMetricBase.MergeByAssertEquals, MergeableMetricBase.MergingIsManual, MergeableMetricBase.NoMergingIsDerived, MergeableMetricBase.NoMergingKeepsValue
-
-
Field Summary
Fields (Modifier and Type, Field: Description)
Double biSiteHeterogeneityRate: The rate of heterogeneity within doubleton sets.
Double biSiteHomogeneityRate: The rate of homogeneity within doubleton sets.
Double independentReplicationRateFromBiDups
Double independentReplicationRateFromTriDups: The biological duplication rate (as a fraction of the duplicate sets) calculated from tripleton sets.
Double independentReplicationRateFromUmi: Given the UMIs one can estimate the rate of biological duplication directly, as this would be the rate of having different UMIs in all duplicate sets.
Integer nAlternateAllelesBiDups: The number of doubletons where the two reads matched the alternate.
Integer nAlternateAllelesTriDups: The number of tripletons where the two reads matched the alternate.
Integer nAlternateReads: The number of alternate alleles in the reads.
Integer nBadBarcodes: The number of sets where the UMIs had poor quality bases and were not used for any comparisons.
Integer nDifferentAllelesBiDups: The number of doubletons where the two reads had different bases in the locus.
Integer nDifferentAllelesTriDups: The number of tripletons where at least one of the reads didn't match either allele of the het site.
Integer nDuplicateSets: The number of duplicate sets examined.
Integer nExactlyDouble: The number of sets of size exactly 2 found.
Integer nExactlyTriple: The number of sets of size exactly 3 found.
Integer nGoodBarcodes: The number of sets where the UMIs had good quality bases and were used for comparisons.
Integer nMatchingUMIsInDiffBiDups: The number of UMIs that match within Bi-sets that come from different alleles.
Integer nMatchingUMIsInSameBiDups: The number of UMIs that match within Bi-sets that come from the same alleles.
Integer nMismatchingAllelesBiDups: The number of doubletons where the two reads had different bases in the locus.
Integer nMismatchingAllelesTriDups: The number of tripletons where at least one of the reads didn't match either allele of the het site.
Integer nMismatchingUMIsInContraOrientedBiDups: The number of bi-sets with mismatching UMIs and opposite orientation.
Integer nMismatchingUMIsInCoOrientedBiDups: The number of bi-sets with mismatching UMIs and same orientation.
Integer nMismatchingUMIsInDiffBiDups: The number of UMIs that are different within Bi-sets that come from different alleles.
Integer nMismatchingUMIsInSameBiDups: The number of UMIs that are different within Bi-sets that come from the same alleles.
Integer nReadsInBigSets: The number of reads in duplicate sets of size greater than 3.
Integer nReferenceAllelesBiDups: The number of doubletons where the two reads matched the reference.
Integer nReferenceAllelesTriDups: The number of tripletons where the two reads matched the reference.
Integer nReferenceReads: The number of reference alleles in the reads.
Integer nSites: The count of sites used.
Integer nThreeAllelesSites: The count of sites in which a third allele was found.
Integer nTotalReads: The total number of reads over the het sites.
Double pSameAlleleWhenMismatchingUmi: When the UMIs mismatch, we expect about the same number of different alleles as the same (assuming that different UMI implies biological duplicate); thus, this value should be near 0.5.
Double pSameUmiInIndependentBiDup: When the alleles are different, we know that this is a biological duplication, thus we expect nearly all the UMIs to be different (allowing for equality due to chance).
Double replicationRateFromReplicateSets: An estimate of the duplication rate that is based on the duplicate sets we observed.
Double triSiteHeterogeneityRate: The rate of heterogeneity within tripleton sets.
Double triSiteHomogeneityRate: The rate of homogeneity within tripleton sets.
-
Constructor Summary
IndependentReplicateMetric()
-
Method Summary
Modifier and Type, Method: Description
void calculateDerivedFields(): Placeholder method that will calculate the derived fields from the other ones.
-
Methods inherited from class picard.analysis.MergeableMetricBase
canMerge, merge, merge, mergeIfCan
-
-
-
-
Field Detail
-
nSites
public Integer nSites
The count of sites used.
-
nThreeAllelesSites
public Integer nThreeAllelesSites
The count of sites in which a third allele was found.
-
nTotalReads
public Integer nTotalReads
The total number of reads over the het sites.
-
nDuplicateSets
public Integer nDuplicateSets
The number of duplicate sets examined.
-
nExactlyTriple
public Integer nExactlyTriple
The number of sets of size exactly 3 found.
-
nExactlyDouble
public Integer nExactlyDouble
The number of sets of size exactly 2 found.
-
nReadsInBigSets
public Integer nReadsInBigSets
The number of reads in duplicate sets of size greater than 3.
-
nDifferentAllelesBiDups
public Integer nDifferentAllelesBiDups
The number of doubletons where the two reads had different bases in the locus.
-
nReferenceAllelesBiDups
public Integer nReferenceAllelesBiDups
The number of doubletons where the two reads matched the reference.
-
nAlternateAllelesBiDups
public Integer nAlternateAllelesBiDups
The number of doubletons where the two reads matched the alternate.
-
nDifferentAllelesTriDups
public Integer nDifferentAllelesTriDups
The number of tripletons where at least one of the reads didn't match either allele of the het site.
-
nMismatchingAllelesBiDups
public Integer nMismatchingAllelesBiDups
The number of doubletons where the two reads had different bases in the locus.
-
nReferenceAllelesTriDups
public Integer nReferenceAllelesTriDups
The number of tripletons where the two reads matched the reference.
-
nAlternateAllelesTriDups
public Integer nAlternateAllelesTriDups
The number of tripletons where the two reads matched the alternate.
-
nMismatchingAllelesTriDups
public Integer nMismatchingAllelesTriDups
The number of tripletons where at least one of the reads didn't match either allele of the het site.
-
nReferenceReads
public Integer nReferenceReads
The number of reference alleles in the reads.
-
nAlternateReads
public Integer nAlternateReads
The number of alternate alleles in the reads.
-
nMismatchingUMIsInDiffBiDups
public Integer nMismatchingUMIsInDiffBiDups
The number of UMIs that are different within Bi-sets that come from different alleles.
-
nMatchingUMIsInDiffBiDups
public Integer nMatchingUMIsInDiffBiDups
The number of UMIs that match within Bi-sets that come from different alleles.
-
nMismatchingUMIsInSameBiDups
public Integer nMismatchingUMIsInSameBiDups
The number of UMIs that are different within Bi-sets that come from the same alleles.
-
nMatchingUMIsInSameBiDups
public Integer nMatchingUMIsInSameBiDups
The number of UMIs that match within Bi-sets that come from the same alleles.
-
nMismatchingUMIsInCoOrientedBiDups
public Integer nMismatchingUMIsInCoOrientedBiDups
The number of bi-sets with mismatching UMIs and same orientation.
-
nMismatchingUMIsInContraOrientedBiDups
public Integer nMismatchingUMIsInContraOrientedBiDups
The number of bi-sets with mismatching UMIs and opposite orientation.
-
nBadBarcodes
public Integer nBadBarcodes
The number of sets where the UMIs had poor quality bases and were not used for any comparisons.
-
nGoodBarcodes
public Integer nGoodBarcodes
The number of sets where the UMIs had good quality bases and were used for comparisons.
-
biSiteHeterogeneityRate
public Double biSiteHeterogeneityRate
The rate of heterogeneity within doubleton sets.
-
triSiteHeterogeneityRate
public Double triSiteHeterogeneityRate
The rate of heterogeneity within tripleton sets.
-
biSiteHomogeneityRate
public Double biSiteHomogeneityRate
The rate of homogeneity within doubleton sets.
-
triSiteHomogeneityRate
public Double triSiteHomogeneityRate
The rate of homogeneity within tripleton sets.
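The four rate fields above are documented only by name, so the following is a minimal sketch of how a heterogeneity and a homogeneity rate could be computed from the doubleton counts. The class BiSiteRatesSketch and its methods are hypothetical and not part of Picard, and the choice of denominator (the sum of nDifferentAllelesBiDups, nReferenceAllelesBiDups and nAlternateAllelesBiDups) is an assumption, not the documented formula.

```java
// Hypothetical helper for illustration only; it is not part of Picard.
// Assumption: the denominator is the number of doubletons that were
// classified as different-allele, reference-only, or alternate-only.
final class BiSiteRatesSketch {
    private BiSiteRatesSketch() {}

    /** Fraction of classified doubletons whose two reads show different alleles. */
    static double heterogeneityRate(int nDifferent, int nReference, int nAlternate) {
        int total = nDifferent + nReference + nAlternate;
        // No classified doubletons: the rate is undefined.
        return total == 0 ? Double.NaN : nDifferent / (double) total;
    }

    /** Fraction of classified doubletons whose two reads show the same allele. */
    static double homogeneityRate(int nDifferent, int nReference, int nAlternate) {
        int total = nDifferent + nReference + nAlternate;
        return total == 0 ? Double.NaN : (nReference + nAlternate) / (double) total;
    }
}
```

Under this assumption the two rates sum to 1 whenever at least one doubleton was classified.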
-
independentReplicationRateFromBiDups
public Double independentReplicationRateFromBiDups
The biological duplication rate (as a fraction of the duplicate sets) calculated from doubleton sets.
-
independentReplicationRateFromTriDups
public Double independentReplicationRateFromTriDups
The biological duplication rate (as a fraction of the duplicate sets) calculated from tripleton sets.
-
pSameUmiInIndependentBiDup
public Double pSameUmiInIndependentBiDup
When the alleles are different, we know that this is a biological duplication, thus we expect nearly all the UMIs to be different (allowing for equality due to chance). So we expect this to be near 1.
-
pSameAlleleWhenMismatchingUmi
public Double pSameAlleleWhenMismatchingUmi
When the UMIs mismatch, we expect about as many sets with different alleles as sets with the same allele (assuming that a different UMI implies a biological duplicate); thus, this value should be near 0.5.
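As a sanity check on the expectation of 0.5, the probability can be sketched as a ratio of the two doubleton counts with mismatching UMIs. The class UmiAlleleSketch is hypothetical and not part of Picard, and the mapping of its parameters to nMismatchingUMIsInSameBiDups and nMismatchingUMIsInDiffBiDups is an assumption drawn from the field descriptions.

```java
// Hypothetical helper for illustration only; it is not part of Picard.
final class UmiAlleleSketch {
    private UmiAlleleSketch() {}

    /**
     * Estimates P(same allele | mismatching UMI) for doubletons.
     *
     * @param mismatchingUmisSameAllele assumed to play the role of nMismatchingUMIsInSameBiDups
     * @param mismatchingUmisDiffAllele assumed to play the role of nMismatchingUMIsInDiffBiDups
     */
    static double pSameAlleleWhenMismatchingUmi(int mismatchingUmisSameAllele,
                                                int mismatchingUmisDiffAllele) {
        int total = mismatchingUmisSameAllele + mismatchingUmisDiffAllele;
        // With no mismatching-UMI doubletons the estimate is undefined.
        return total == 0 ? Double.NaN : mismatchingUmisSameAllele / (double) total;
    }
}
```

A result far from 0.5 would suggest that mismatching UMIs are not a reliable sign of biological duplication in the data at hand.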
-
independentReplicationRateFromUmi
public Double independentReplicationRateFromUmi
Given the UMIs one can estimate the rate of biological duplication directly, as this would be the rate of having different UMIs in all duplicate sets. This is only a good estimate if the assumptions hold, for example if pSameUmiInIndependentBiDup is near 1.
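The description above can be read as "the fraction of duplicate sets whose UMIs differ". A minimal sketch of that reading, restricted to doubletons with usable barcodes, follows. The class UmiReplicationSketch is hypothetical and not part of Picard, and the exact numerator and denominator are assumptions rather than the documented formula.

```java
// Hypothetical helper for illustration only; it is not part of Picard.
// Assumption: only doubletons with usable UMIs are counted, split by
// whether the UMIs match and whether the alleles are the same.
final class UmiReplicationSketch {
    private UmiReplicationSketch() {}

    /** Fraction of counted doubletons whose two UMIs differ. */
    static double replicationRateFromUmi(int mismatchingUmisDiffAllele,
                                         int matchingUmisDiffAllele,
                                         int mismatchingUmisSameAllele,
                                         int matchingUmisSameAllele) {
        int mismatching = mismatchingUmisDiffAllele + mismatchingUmisSameAllele;
        int total = mismatching + matchingUmisDiffAllele + matchingUmisSameAllele;
        // No counted doubletons: the rate is undefined.
        return total == 0 ? Double.NaN : mismatching / (double) total;
    }
}
```

For example, 20 + 20 mismatching doubletons out of 100 counted doubletons would give an estimate of 0.4.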
-
replicationRateFromReplicateSets
public Double replicationRateFromReplicateSets
An estimate of the duplication rate that is based on the duplicate sets we observed.
-
-
Method Detail
-
calculateDerivedFields
public void calculateDerivedFields()
Description copied from class: MergeableMetricBase
Placeholder method that will calculate the derived fields from the other ones. Classes that are derived from non-trivial derived classes should consider calling super.calculateDerivedFields() as well. Fields whose value will change due to this method should be annotated with NoMergingKeepsValue.
Overrides:
calculateDerivedFields in class MergeableMetricBase
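The advice to call super.calculateDerivedFields() can be illustrated with a self-contained sketch. MergeableSketch and ReplicateSketch are hypothetical stand-ins, not the Picard classes, and mergeCounts is an invented method that only mimics the idea of merging additive counts before recomputing derived values. The field names and formulas are likewise assumptions.

```java
// Hypothetical stand-ins for illustration only; these are NOT the Picard classes.
class MergeableSketch {
    int nSets;                 // additive count
    int nDoubletons;           // additive count
    double doubletonFraction;  // derived value, recomputed rather than merged

    /** Adds the counts of {@code other} into this object; derived fields are left alone. */
    MergeableSketch mergeCounts(MergeableSketch other) {
        nSets += other.nSets;
        nDoubletons += other.nDoubletons;
        return this;
    }

    /** Recomputes the derived fields owned by this level of the hierarchy. */
    void calculateDerivedFields() {
        doubletonFraction = nSets == 0 ? Double.NaN : nDoubletons / (double) nSets;
    }
}

class ReplicateSketch extends MergeableSketch {
    int nDifferentAlleles;     // additive count
    double heterogeneityRate;  // derived value

    @Override
    MergeableSketch mergeCounts(MergeableSketch other) {
        super.mergeCounts(other);
        if (other instanceof ReplicateSketch) {
            nDifferentAlleles += ((ReplicateSketch) other).nDifferentAlleles;
        }
        return this;
    }

    @Override
    void calculateDerivedFields() {
        // Let the parent refresh its own derived fields first.
        super.calculateDerivedFields();
        heterogeneityRate = nDoubletons == 0
                ? Double.NaN
                : nDifferentAlleles / (double) nDoubletons;
    }
}
```

In this sketch, derived values are only meaningful after the counts have been merged and calculateDerivedFields() has been called on the merged object.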
-
-