README file for the path ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/xml/clinvar_variation/beta/

This directory contains a beta release of XML for Variation ClinVar accessions, or VCVs.


March 27, 2018

For these two elements:
    /ClinVarVariationRelease/VariationArchive/ReplacedList/Replaced
    /ClinVarVariationRelease/VariationArchive/ReplacedBy/
the "Description" element was renamed to "Comment" and uses CommentType "MergeComment".

Known issues:
- Some MeasureSet types are not currently supported, including Phase unknown and Distinct chromosomes
- Some haplotypes are reported with multiple SimpleAllele elements but no Haplotype element
- Some Gene elements do not have @Symbol or @FullName attributes


January 3, 2018

- The XRefList element under SimpleAllele, Haplotype, and Genotype now uses specific Types for ClinVar Variation IDs, to indicate whether the Variation ID represents an Included or an Interpreted record.

Known issues:
- Some MeasureSet types are not currently supported, including Phase unknown and Distinct chromosomes
- Some haplotypes are reported with multiple SimpleAllele elements but no Haplotype element
- Some Gene elements do not have @Symbol or @FullName attributes


November 28, 2017

- Submission name was moved from an attribute on ClinicalAssertion:
    /VariationArchive/InterpretedRecord/ClinicalAssertionList/ClinicalAssertion/@SubmissionName
  to a list of SubmissionNames:
    /VariationArchive/InterpretedRecord/ClinicalAssertionList/ClinicalAssertion/SubmissionNameList/SubmissionName
- Records with VariationType 'Phase unknown' were added.
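
For XML users whose code must read files from both before and after the November 28, 2017 change, the move of the submission name can be handled roughly as follows. This is a minimal sketch, not part of the release: the function name submission_names and the sample fragments are hypothetical, and it assumes the elements carry no XML namespace and that each SubmissionName holds the name as element text.

```python
import xml.etree.ElementTree as ET


def submission_names(clinical_assertion):
    """Return the submission names attached to one ClinicalAssertion element.

    Reads SubmissionNameList/SubmissionName first and falls back to the
    older @SubmissionName attribute when no list entries are present.
    """
    names = [
        el.text.strip()
        for el in clinical_assertion.findall("SubmissionNameList/SubmissionName")
        if el.text and el.text.strip()
    ]
    if not names:
        legacy = clinical_assertion.get("SubmissionName")
        if legacy:
            names.append(legacy)
    return names


# Hypothetical fragments for illustration only (not real ClinVar data).
new_style = ET.fromstring(
    "<ClinicalAssertion><SubmissionNameList>"
    "<SubmissionName>EXAMPLE_SUB_1</SubmissionName>"
    "<SubmissionName>EXAMPLE_SUB_2</SubmissionName>"
    "</SubmissionNameList></ClinicalAssertion>"
)
old_style = ET.fromstring('<ClinicalAssertion SubmissionName="EXAMPLE_SUB_1"/>')

print(submission_names(new_style))
print(submission_names(old_style))
```

Returning a list in both cases keeps downstream code independent of which layout a given file uses.
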

Known issues:
- Some MeasureSet types are not currently supported, including Distinct chromosomes
- Some haplotypes are reported with multiple SimpleAllele elements but no Haplotype element
- Some Gene elements do not have @Symbol or @FullName attributes


November 14, 2017

- the ClinVarAccession element
    /VariationArchive/InterpretedRecord/RCVList/ClinVarAccession
  was renamed to
    /VariationArchive/InterpretedRecord/RCVList/RCVAccession
- the attribute @InterpretedCondition on the element
    /VariationArchive/InterpretedRecord/RCVList/ClinVarAccession
  was changed to an element within the InterpretedConditionList:
    /VariationArchive/InterpretedRecord/RCVList/RCVAccession/InterpretedConditionList/InterpretedCondition
  with a new type, typeRCVInterpretedCondition
- the attribute @Version on the element
    /VariationArchive/InterpretedRecord/RCVList/ClinVarAccession
  was changed from required to optional.
- the VariationID element
    /VariationArchive/IncludedRecord/InterpretedVariationList/VariationID
  was changed to an attribute, @VariationID, on InterpretedVariation:
    /VariationArchive/IncludedRecord/InterpretedVariationList/InterpretedVariation
  InterpretedVariation also has attributes for Accession and Version.
- the SCV element
    /VariationArchive/IncludedRecord/SubmittedInterpretationList/SCV
  has two additional attributes, @Accession and @Version
- the Condition element
    /VariationArchive/InterpretedRecord/Interpretations/ConditionList/Condition
  was changed to
    /VariationArchive/InterpretedRecord/Interpretations/ConditionList/ConditionList/TraitSet
  with type ClinAsserTraitSetType
- the VariationName attribute on the VariationArchive element was changed from optional to required.
- attributes for @OrgID, @OrganizationCategory, @SubmitterName, and @OrgAbbreviation were removed from the ClinicalAssertion element
    /VariationArchive/InterpretedRecord/ClinicalAssertionList/ClinicalAssertion
  These attributes remain on the ClinVarAccession element
    /VariationArchive/InterpretedRecord/ClinicalAssertionList/ClinicalAssertion/ClinVarAccession

Known issues:
- Some MeasureSet types are not currently supported, including Phase unknown and Distinct chromosomes
- Some haplotypes are reported with multiple SimpleAllele elements but no Haplotype element
- Some Gene elements do not have @Symbol or @FullName attributes


Last updated July 19, 2017

This directory contains a beta release of XML for Variation ClinVar accessions, or VCVs.

File names will be constructed in the format of
    ClinVarVariationRelease_YYYY-MMDD.xml.gz
where YYYY, MM, and DD are the year, month and day the file was created.

Data in ClinVarVariationRelease is aggregated by the VariationID, which represents the variant or set of variants that were interpreted for clinical or functional significance, or that were components of such interpretations. This aggregation of data is assigned an accession number, with the prefix VCV (Variation in ClinVar) followed by nine digits. The digits are the same Variation ID reported on ClinVar's web site and in ClinVarFullRelease files, padded with leading zeros. The accessions are versioned, with versions incremented when new or updated submissions are processed for the same VariationID. In the beta release, all VCV accessions will have version 1; the version number will not increment until the production release.

This file has been constructed to make it easier for users who want to access all data for a variant or set of variants, not separated by the diseases for which they have been interpreted. The content is expected to be equivalent to data in ClinVarFullRelease, just organized differently.
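
The accession and file-name conventions described above can be sketched as follows. This is an illustrative sketch only: the function names vcv_accession, variation_id_from_vcv, and release_file_name are hypothetical helpers, and the example Variation ID is arbitrary. The sketch leaves the version out, because the text above does not say how a version is written alongside the accession.

```python
import re
from datetime import date


def vcv_accession(variation_id):
    """Build a VCV accession: the prefix 'VCV' plus the Variation ID
    zero-padded to nine digits."""
    if not 0 < variation_id <= 999_999_999:
        raise ValueError(f"Variation ID does not fit in nine digits: {variation_id}")
    return f"VCV{variation_id:09d}"


def variation_id_from_vcv(accession):
    """Recover the integer Variation ID from an unversioned VCV accession."""
    match = re.fullmatch(r"VCV(\d{9})", accession)
    if match is None:
        raise ValueError(f"Not a VCV accession: {accession!r}")
    return int(match.group(1))


def release_file_name(created):
    """Build a file name in the ClinVarVariationRelease_YYYY-MMDD.xml.gz format."""
    return f"ClinVarVariationRelease_{created:%Y-%m%d}.xml.gz"


print(vcv_accession(12345))
print(variation_id_from_vcv("VCV000012345"))
print(release_file_name(date(2017, 7, 19)))
```
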

As with ClinVarFullRelease, some content in ClinVarVariationRelease is aggregated across all information in ClinVar for the same VariationID, while other elements, namely the
    /ClinVarVariationRelease/VariationArchive/InterpretedRecord/ClinicalAssertionList
path, represent contributions from each submission.

Please note: consistency checks between ClinVarFullRelease and ClinVarVariationRelease are still being polished, so in the beta phase there may be some differences in content.

We will retain ClinVarFullRelease, the archive of ClinVar data aggregated by accessions beginning with RCV, corresponding to a variant-disease pair.

Updates to ClinVarVariationRelease in the beta phase will be done irregularly and as needed, in response to development updates and bug reports. Updates to ClinVarVariationRelease will use the same snapshot of data as the weekly update for ClinVarFullRelease.

Features in ClinVarVariationRelease include:
- explicit elements to distinguish records for simple alleles vs. haplotypes vs. genotypes
- explicit elements to distinguish between variants that were directly interpreted vs. variants that were interpreted only as part of a haplotype or genotype (i.e. "included" variants). The clinical significance for included variants is indicated as "no interpretation for the single variant".

Some features are not yet included in ClinVarVariationRelease but will be added before the production release:
- a history indicating accessions that were merged into the current accession (Replaces element)
- a section to map the submitted name or identifier for the interpreted condition to the corresponding name used in ClinVar and MedGen CUI
- a complementary file of deleted VCV accessions
- certain types of variant sets are not yet included in the release: diplotypes, phase unknown, different chromosomes

We anticipate that the beta release will last six weeks; after that, we will move into a production mode.
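
As a rough illustration of the record layout described above, the sketch below streams a gzipped release file one VariationArchive at a time. For each record it reports whether the record is interpreted or included and how many per-submission ClinicalAssertion entries it holds. It is a sketch under stated assumptions, not a reference parser: the function name iter_variation_archives and the in-memory sample are hypothetical, the elements are assumed to carry no XML namespace, and the authoritative structure is the one defined in variation_archive.xsd.

```python
import gzip
import io
import xml.etree.ElementTree as ET


def iter_variation_archives(stream):
    """Yield (VariationName, record kind, ClinicalAssertion count) per VariationArchive.

    Uses iterparse so the whole release file never has to fit in memory.
    """
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag != "VariationArchive":
            continue
        if elem.find("InterpretedRecord") is not None:
            kind = "interpreted"
        elif elem.find("IncludedRecord") is not None:
            kind = "included"
        else:
            kind = "unknown"
        assertions = elem.findall(
            "InterpretedRecord/ClinicalAssertionList/ClinicalAssertion"
        )
        yield elem.get("VariationName"), kind, len(assertions)
        elem.clear()  # release the children of the record just processed


# Hypothetical miniature file for illustration only (not real ClinVar data).
sample = b"""<ClinVarVariationRelease>
  <VariationArchive VariationName="example variant A">
    <InterpretedRecord>
      <ClinicalAssertionList><ClinicalAssertion/><ClinicalAssertion/></ClinicalAssertionList>
    </InterpretedRecord>
  </VariationArchive>
  <VariationArchive VariationName="example variant B">
    <IncludedRecord/>
  </VariationArchive>
</ClinVarVariationRelease>"""

with gzip.open(io.BytesIO(gzip.compress(sample))) as handle:
    records = list(iter_variation_archives(handle))

for record in records:
    print(record)
```

For a downloaded file, the same function can be given the handle returned by gzip.open on the local .xml.gz path.
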
During the beta release, we ask our XML users to review the file and send feedback and error reports to clinvar@ncbi.nlm.nih.gov.

See also:
* the XSD for ClinVarVariationRelease:
    ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/xml/clinvar_variation/beta/variation_archive.xsd
  When in production mode, variation_archive.xsd will be versioned and provided from
    ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/xsd_public
* the updated ClinVar Data Dictionary:
    https://www.ncbi.nlm.nih.gov/projects/clinvar/ClinVarDataDictionary.pdf