Package vcf
Class VcfRec
java.lang.Object
vcf.VcfRec
- All Implemented Interfaces:
IntArray, DuplicatesGTRec, GTRec, MarkerContainer
Class VcfRec represents a VCF record. If one allele in a
diploid genotype is missing, then both alleles are set to missing.
Instances of class VcfRec are immutable.
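The missing-allele rule above can be illustrated with a small standalone sketch. The class GtParseSketch and its parseGt method are hypothetical names introduced for illustration, not part of the vcf package, and the parsing of the GT value (splitting on "/" or "|", "." for a missing allele) follows the VCF convention rather than the actual VcfRec implementation.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of the vcf package. */
final class GtParseSketch {

    /**
     * Parses a haploid or diploid GT value such as "0/1", "1|0", "0/." or "1"
     * into allele indices, using -1 for a missing allele. If any allele is
     * missing, every allele of the genotype is reported as missing.
     */
    static int[] parseGt(String gt) {
        String[] parts = gt.split("[/|]");
        if (parts.length < 1 || parts.length > 2) {
            throw new IllegalArgumentException("unsupported GT value: " + gt);
        }
        int[] alleles = new int[parts.length];
        boolean anyMissing = false;
        for (int j = 0; j < parts.length; ++j) {
            if (parts[j].equals(".")) {
                alleles[j] = -1;
                anyMissing = true;
            } else {
                alleles[j] = Integer.parseInt(parts[j]);
            }
        }
        if (anyMissing) {
            // one missing allele makes the whole genotype missing
            Arrays.fill(alleles, -1);
        }
        return alleles;
    }
}
```

With this sketch, a half-missing value such as "0/." yields two missing alleles, which matches the behavior described for the class.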
-
Field Summary
Fields -
Method Summary
Modifier and Type | Method | Description
int | allele1(int sample) | Returns the first allele for the specified sample or -1 if the allele is missing.
int | allele2(int sample) | Returns the second allele for the specified sample or -1 if the allele is missing.
int[] | alleles() | Returns an array of length this.size() whose j-th element is equal to this.allele(j).
 | filter() | Returns the FILTER field.
 | format() | Returns the FORMAT field.
String[] | formatData(String formatCode) | Returns an array of length this.size() containing the specified FORMAT subfield data for each sample.
int | formatIndex(String formatCode) | Returns the index of the specified FORMAT subfield if the specified subfield is defined for this VCF record, and returns -1 otherwise.
 | formatSubfield(int subfieldIndex) | Returns the specified FORMAT subfield.
static VcfRec | fromGL | Constructs and returns a new VcfRec instance from a VCF record and its GL or PL format subfield data.
static VcfRec | fromGT | Constructs and returns a new VcfRec instance from a VCF record and its GT format subfield data.
static VcfRec | fromGTGL | Constructs and returns a new VcfRec instance from a VCF record and its GT, GL, and PL format subfield data.
int | get(int hap) | Returns the specified allele for the specified haplotype or -1 if the allele is missing.
float | gl(int sample, int allele1, int allele2) | Returns the probability of the observed data for the specified sample if the specified pair of ordered alleles is the true ordered genotype.
static int | gtIndex(int a1, int a2) | Returns the VCF genotype index for the specified pair of alleles.
boolean | hasFormat | Returns true if the specified FORMAT subfield is present, and returns false otherwise.
 | info() | Returns the INFO field.
boolean | isPhased() | Returns true if every genotype for each sample is a phased, non-missing genotype, and returns false otherwise.
boolean | isPhased(int sample) | Returns true if the genotype for the specified sample has non-missing alleles and is either haploid or diploid with a phased allele separator, and returns false otherwise.
 | marker() | Returns the marker.
int | nFormatSubfields() | Returns the number of FORMAT subfields.
 | qual() | Returns the QUAL field.
 | sampleData(int sample) | Returns the data for the specified sample.
 | sampleData(int sample, int subfieldIndex) | Returns the specified data for the specified sample.
 | sampleData(int sample, String formatCode) | Returns the specified data for the specified sample.
 | samples() | Returns the list of samples.
int | size() | Returns the number of haplotypes.
 | toString() | Returns the VCF record.
 | vcfHeader | Returns the VCF meta-information lines and the VCF header line.
-
Field Details
-
GL_FORMAT
The VCF FORMAT code for log-scaled genotype likelihood data: "GL".
-
PL_FORMAT
The VCF FORMAT code for phred-scaled genotype likelihood data: "PL".
-
-
Method Details
-
gtIndex
public static int gtIndex(int a1, int a2)
Returns the VCF genotype index for the specified pair of alleles.
- Parameters:
a1 - the first allele
a2 - the second allele
- Returns:
the VCF genotype index for the specified pair of alleles
- Throws:
IllegalArgumentException - if a1 < 0 || a2 < 0
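
As an illustration of a genotype index, the following standalone sketch uses the ordering defined in the VCF specification, where the unordered genotype j/k with j <= k has index k*(k+1)/2 + j. The class GtIndexSketch is a hypothetical name, and it is an assumption that VcfRec.gtIndex follows this same formula.

```java
/** Hypothetical helper for illustration; not part of the vcf package. */
final class GtIndexSketch {

    /**
     * Returns the VCF-specification genotype index for an unordered pair of
     * alleles: index(j/k) = k*(k+1)/2 + j, where j <= k.
     */
    static int gtIndex(int a1, int a2) {
        if (a1 < 0 || a2 < 0) {
            throw new IllegalArgumentException("negative allele: " + a1 + ", " + a2);
        }
        int lo = Math.min(a1, a2);
        int hi = Math.max(a1, a2);
        return hi * (hi + 1) / 2 + lo;
    }
}
```

For a marker with three alleles this gives the order 0/0, 0/1, 1/1, 0/2, 1/2, 2/2, which is the order in which GL and PL values are listed in a VCF record.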
-
fromGT
Constructs and returns a new VcfRec instance from a VCF record and its GT format subfield data.
- Parameters:
vcfHeader - meta-information lines and header line for the specified VCF record
vcfRecord - a VCF record with a GT format field corresponding to the specified vcfHeader object
- Returns:
a new VcfRec instance
- Throws:
IllegalArgumentException - if the VCF record does not have a GT format field
IllegalArgumentException - if a VCF record format error is detected
IllegalArgumentException - if there are not vcfHeader.nHeaderFields() tab-delimited fields in the specified VCF record
NullPointerException - if vcfHeader == null || vcfRecord == null
-
fromGL
Constructs and returns a new VcfRec instance from a VCF record and its GL or PL format subfield data. If both GL and PL format subfields are present, the GL format field will be used. If the maximum normalized genotype likelihood is 1.0 for a sample, then any other genotype likelihood for the sample that is less than lrThreshold is set to 0.
- Parameters:
vcfHeader - meta-information lines and header line for the specified VCF record
vcfRecord - a VCF record with a GL format field corresponding to the specified vcfHeader object
maxLR - the maximum likelihood ratio
- Returns:
a new VcfRec instance
- Throws:
IllegalArgumentException - if the VCF record does not have a GL format field
IllegalArgumentException - if a VCF record format error is detected
IllegalArgumentException - if there are not vcfHeader.nHeaderFields() tab-delimited fields in the specified VCF record
NullPointerException - if vcfHeader == null || vcfRecord == null
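
The thresholding rule described for fromGL can be sketched in isolation. The class GlThresholdSketch and its normalize method are hypothetical, and two assumptions are made that the documentation does not confirm: that "normalized" means scaling the likelihoods so that the largest is 1.0, and that lrThreshold is passed in directly (the documentation does not say how it relates to the maxLR parameter). GL values are taken to be log10-scaled, as in the VCF specification.

```java
/** Hypothetical helper for illustration; not part of the vcf package. */
final class GlThresholdSketch {

    /**
     * Converts log10-scaled genotype likelihoods for one sample into
     * likelihoods scaled so that the largest is 1.0 (an assumed meaning of
     * "normalized"), then sets any value below lrThreshold to 0.
     */
    static float[] normalize(float[] log10Gl, float lrThreshold) {
        float max = Float.NEGATIVE_INFINITY;
        for (float v : log10Gl) {
            max = Math.max(max, v);
        }
        float[] out = new float[log10Gl.length];
        for (int j = 0; j < log10Gl.length; ++j) {
            // subtracting the maximum makes the largest likelihood equal 1.0
            float like = (float) Math.pow(10.0, log10Gl[j] - max);
            out[j] = (like < lrThreshold) ? 0f : like;
        }
        return out;
    }
}
```

For example, under these assumptions the GL values 0, -1, -5 with a threshold of 0.001 become 1.0, 0.1, and 0.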
-
fromGTGL
Constructs and returns a new VcfRec instance from a VCF record and its GT, GL, and PL format subfield data. If the GT format subfield is present and non-missing, the GT format subfield is used to determine genotype likelihoods. Otherwise the GL or PL format subfield is used to determine genotype likelihoods. If both the GL and PL format subfields are present, only the GL format subfield will be used. If the maximum normalized genotype likelihood is 1.0 for a sample, then any other genotype likelihood for the sample that is less than lrThreshold is set to 0.
- Parameters:
vcfHeader - meta-information lines and header line for the specified VCF record
vcfRecord - a VCF record with a GT, a GL, or a PL format field corresponding to the specified vcfHeader object
maxLR - the maximum likelihood ratio
- Returns:
a new VcfRec instance
- Throws:
IllegalArgumentException - if the VCF record does not have a GT, GL, or PL format field
IllegalArgumentException - if a VCF record format error is detected
IllegalArgumentException - if there are not vcfHeader.nHeaderFields() tab-delimited fields in the specified VCF record
NullPointerException - if vcfHeader == null || vcfRecord == null
-
qual
Returns the QUAL field.
- Returns:
the QUAL field
-
filter
Returns the FILTER field.
- Returns:
the FILTER field
-
info
Returns the INFO field.
- Returns:
the INFO field
-
format
Returns the FORMAT field. Returns the empty string ("") if the FORMAT field is missing.
- Returns:
the FORMAT field
-
nFormatSubfields
public int nFormatSubfields()
Returns the number of FORMAT subfields.
- Returns:
the number of FORMAT subfields
-
formatSubfield
Returns the specified FORMAT subfield.
- Parameters:
subfieldIndex - a FORMAT subfield index
- Returns:
the specified FORMAT subfield
- Throws:
IndexOutOfBoundsException - if subfieldIndex < 0 || subfieldIndex >= this.nFormatSubfields()
-
hasFormat
Returns true if the specified FORMAT subfield is present, and returns false otherwise.
- Parameters:
formatCode - a FORMAT subfield code
- Returns:
true if the specified FORMAT subfield is present
-
formatIndex
Returns the index of the specified FORMAT subfield if the specified subfield is defined for this VCF record, and returns -1 otherwise.
- Parameters:
formatCode - the FORMAT subfield code
- Returns:
the index of the specified FORMAT subfield if the specified subfield is defined for this VCF record, and -1 otherwise
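
The lookup behavior of formatIndex can be sketched against a raw FORMAT string. FormatIndexSketch is a hypothetical class that takes the colon-delimited FORMAT field as an argument; the real method is an instance method that reads the FORMAT field of the record, and its internals are not shown in this documentation.

```java
/** Hypothetical helper for illustration; not part of the vcf package. */
final class FormatIndexSketch {

    /**
     * Returns the position of formatCode within a colon-delimited FORMAT
     * field such as "GT:DP:GL", or -1 if the code is not present.
     */
    static int formatIndex(String format, String formatCode) {
        String[] codes = format.split(":");
        for (int j = 0; j < codes.length; ++j) {
            if (codes[j].equals(formatCode)) {
                return j;
            }
        }
        return -1;
    }
}
```

Under this sketch, a hasFormat check reduces to testing whether the returned index is non-negative.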
-
sampleData
Returns the data for the specified sample.
- Parameters:
sample - a sample index
- Returns:
the data for the specified sample
- Throws:
IndexOutOfBoundsException - if sample < 0 || sample >= this.size()
-
sampleData
Returns the specified data for the specified sample.
- Parameters:
sample - a sample index
formatCode - a FORMAT subfield code
- Returns:
the specified data for the specified sample
- Throws:
IllegalArgumentException - if this.hasFormat(formatCode) == false
IndexOutOfBoundsException - if sample < 0 || sample >= this.size()
-
sampleData
Returns the specified data for the specified sample.
- Parameters:
sample - a sample index
subfieldIndex - a FORMAT subfield index
- Returns:
the specified data for the specified sample
- Throws:
IndexOutOfBoundsException - if subfieldIndex < 0 || subfieldIndex >= this.nFormatSubfields()
IndexOutOfBoundsException - if sample < 0 || sample >= this.size()
-
formatData
Returns an array of length this.size() containing the specified FORMAT subfield data for each sample. The k-th element of the array is the specified FORMAT subfield data for the k-th sample.
- Parameters:
formatCode - a FORMAT subfield code
- Returns:
an array of length this.size() containing the specified FORMAT subfield data for each sample
- Throws:
IllegalArgumentException - if this.hasFormat(formatCode) == false
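
A rough sketch of the per-sample extraction that formatData describes is shown here. FormatDataSketch is a hypothetical class working on raw strings: the FORMAT field and the sample columns are passed in explicitly, whereas the real method reads them from the record. Returning "." when a sample omits trailing subfields is an assumption based on the VCF convention that trailing subfields may be dropped; the documentation does not say how VcfRec handles that case.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of the vcf package. */
final class FormatDataSketch {

    /**
     * Returns an array whose k-th element is the value of the specified
     * FORMAT subfield in the k-th sample column.
     */
    static String[] formatData(String format, String[] sampleFields, String formatCode) {
        int index = Arrays.asList(format.split(":")).indexOf(formatCode);
        if (index < 0) {
            throw new IllegalArgumentException("missing FORMAT subfield: " + formatCode);
        }
        String[] data = new String[sampleFields.length];
        for (int k = 0; k < sampleFields.length; ++k) {
            String[] values = sampleFields[k].split(":");
            // assumption: a dropped trailing subfield is reported as missing (".")
            data[k] = (index < values.length) ? values[index] : ".";
        }
        return data;
    }
}
```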
-
samples
Description copied from interface: GTRec
Returns the list of samples.
-
vcfHeader
Returns the VCF meta-information lines and the VCF header line.
- Returns:
the VCF meta-information lines and the VCF header line
-
marker
Description copied from interface: MarkerContainer
Returns the marker.
- Specified by:
marker in interface MarkerContainer
- Returns:
the marker
-
allele1
public int allele1(int sample)
Description copied from interface: DuplicatesGTRec
Returns the first allele for the specified sample or -1 if the allele is missing. The two alleles for a sample are arbitrarily ordered if this.unphased(marker, sample) == false.
- Specified by:
allele1 in interface DuplicatesGTRec
- Parameters:
sample - a sample index
- Returns:
the first allele for the specified sample
-
allele2
public int allele2(int sample)
Description copied from interface: DuplicatesGTRec
Returns the second allele for the specified sample or -1 if the allele is missing. The two alleles for a sample are arbitrarily ordered if this.unphased(marker, sample) == false.
- Specified by:
allele2 in interface DuplicatesGTRec
- Parameters:
sample - a sample index
- Returns:
the second allele for the specified sample
-
get
public int get(int hap)
Description copied from interface: DuplicatesGTRec
Returns the specified allele for the specified haplotype or -1 if the allele is missing. The two alleles for a sample at a marker are arbitrarily ordered if this.unphased(marker, hap/2) == false.
- Specified by:
get in interface DuplicatesGTRec
- Specified by:
get in interface IntArray
- Parameters:
hap - a haplotype index
- Returns:
the specified allele for the specified haplotype
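
The relationship between haplotype indices and sample indices can be pictured with a small sketch. HapIndexSketch is a hypothetical class, and the layout it uses (haplotypes 2*sample and 2*sample + 1 hold the first and second allele of a sample) is an assumption suggested by the hap/2 expression above, not something the documentation states outright.

```java
/** Hypothetical helper for illustration; not part of the vcf package. */
final class HapIndexSketch {

    // assumed layout: two consecutive haplotype entries per sample, -1 = missing
    private final int[] alleles;

    HapIndexSketch(int[] alleles) {
        this.alleles = alleles.clone(); // defensive copy keeps the instance immutable
    }

    /** Returns the number of haplotypes. */
    int size() {
        return alleles.length;
    }

    /** Returns the allele on the specified haplotype, or -1 if missing. */
    int get(int hap) {
        return alleles[hap];
    }

    /** Returns the first allele of the sample (assumed to be haplotype 2*sample). */
    int allele1(int sample) {
        return alleles[2 * sample];
    }

    /** Returns the second allele of the sample (assumed to be haplotype 2*sample + 1). */
    int allele2(int sample) {
        return alleles[2 * sample + 1];
    }
}
```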
-
alleles
public int[] alleles()
Description copied from interface: DuplicatesGTRec
Returns an array of length this.size() whose j-th element is equal to this.allele(j).
- Specified by:
alleles in interface DuplicatesGTRec
- Returns:
an array of length this.size() whose j-th element is equal to this.allele(j)
-
isPhased
public boolean isPhased(int sample)
Description copied from interface: DuplicatesGTRec
Returns true if the genotype for the specified sample has non-missing alleles and is either haploid or diploid with a phased allele separator, and returns false otherwise.
- Specified by:
isPhased in interface DuplicatesGTRec
- Parameters:
sample - a sample index
- Returns:
true if the genotype for the specified sample is a phased, non-missing genotype
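
The phasing condition above can be expressed directly on a GT value. PhaseCheckSketch is a hypothetical class operating on raw GT strings, assuming the VCF conventions that "|" is the phased separator, "/" is the unphased separator, and "." marks a missing allele; it does not reproduce the internal representation used by VcfRec.

```java
/** Hypothetical helper for illustration; not part of the vcf package. */
final class PhaseCheckSketch {

    /**
     * Returns true if the GT value has no missing allele and is either
     * haploid (no separator) or diploid with the phased separator '|'.
     */
    static boolean isPhased(String gt) {
        if (gt.isEmpty() || gt.contains(".")) {
            return false; // a missing allele is never phased
        }
        return !gt.contains("/"); // '/' marks an unphased genotype
    }

    /** Returns true if every GT value in the array is phased and non-missing. */
    static boolean allPhased(String[] gts) {
        for (String gt : gts) {
            if (!isPhased(gt)) {
                return false;
            }
        }
        return true;
    }
}
```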
-
isPhased
public boolean isPhased()
Description copied from interface: DuplicatesGTRec
Returns true if the genotype for each sample is a phased, non-missing genotype, and returns false otherwise.
- Specified by:
isPhased in interface DuplicatesGTRec
- Returns:
true if the genotype for each sample is a phased, non-missing genotype
-
gl
public float gl(int sample, int allele1, int allele2)
Returns the probability of the observed data for the specified sample if the specified pair of ordered alleles is the true ordered genotype. Returns 1.0f if the corresponding genotype determined by the isPhased(), allele1(), and allele2() methods is consistent with the specified ordered genotype, and returns 0.0f otherwise.
- Parameters:
sample - the sample index
allele1 - the first allele index
allele2 - the second allele index
- Returns:
the probability of the observed data for the specified sample if the specified pair of ordered alleles is the true ordered genotype
- Throws:
IndexOutOfBoundsException - if sample < 0 || sample >= this.size()
IndexOutOfBoundsException - if allele1 < 0 || allele1 >= this.marker().nAlleles()
IndexOutOfBoundsException - if allele2 < 0 || allele2 >= this.marker().nAlleles()
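
One possible reading of "consistent with the specified ordered genotype" is sketched below. GlFromGtSketch is a hypothetical class, and its rules are assumptions: a phased genotype is consistent only with the same allele order, an unphased genotype is consistent with either order, and a missing genotype is consistent with every ordered genotype. The documentation does not spell out these cases, so the actual method may differ.

```java
/** Hypothetical helper for illustration; not part of the vcf package. */
final class GlFromGtSketch {

    /**
     * Returns 1.0f if the observed genotype (obs1, obs2, phased flag) is
     * consistent with the ordered genotype (a1, a2), and 0.0f otherwise.
     * A missing allele is coded as -1.
     */
    static float gl(boolean phased, int obs1, int obs2, int a1, int a2) {
        if (obs1 < 0 || obs2 < 0) {
            return 1.0f; // assumption: missing data does not rule out any genotype
        }
        if (obs1 == a1 && obs2 == a2) {
            return 1.0f; // same alleles in the same order
        }
        if (!phased && obs1 == a2 && obs2 == a1) {
            return 1.0f; // unphased: allele order is arbitrary
        }
        return 0.0f;
    }
}
```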
-
size
public int size()
Description copied from interface: DuplicatesGTRec
Returns the number of haplotypes.
- Specified by:
size in interface DuplicatesGTRec
- Specified by:
size in interface IntArray
- Returns:
the number of haplotypes
-
toString
Returns the VCF record.
-