This directory contains gzip-compressed GVF (Genome Variation Format) files:

    Since release 93, the following files are dumped per chromosome
    (1-22, MT, X, Y):
    homo_sapiens-chr*.gvf.gz
        All germline variations from the current Ensembl release for this
        species
    homo_sapiens_incl_consequences-chr*.gvf.gz
        All consequences of the variations on the Ensembl transcriptome,
        as called by the variation consequence pipeline

    homo_sapiens_structural_variations.gvf.gz
        All structural variations (if available for this species)
    homo_sapiens_failed.gvf.gz
        Any variations that have been failed by the Ensembl QC checks
    homo_sapiens_somatic.gvf.gz
        All somatic mutations from the current Ensembl release
    homo_sapiens_somatic_incl_consequences.gvf.gz
        All consequences of somatic mutations on the Ensembl transcriptome,
        as called by the variation consequence pipeline
    homo_sapiens_phenotype_associated.gvf.gz
        All variations from the current Ensembl release that have been
        associated with a phenotype
    homo_sapiens_clinically_associated.gvf.gz
        All variations from the current Ensembl release that have been
        described by ClinVar as being probable-pathogenic, pathogenic,
        drug-response or histocompatibility

Additionally, we provide for human:
    - 1000GENOMES-phase_3.gvf.gz containing allele frequencies from the
      1000 Genomes phase 3 populations
    - NOTE:
        - files containing allele frequencies from several of the HapMap
          populations have been discontinued and are available from our archive sites,
          the latest being https://ftp.ensembl.org/pub/release-97/variation/gvf/homo_sapiens/
        - files containing allele frequencies from populations from the
          Exome Sequencing Project have been discontinued and are available
          from our archive sites, the latest being https://ftp.ensembl.org/pub/release-98/variation/gvf/homo_sapiens/

If available for this species, the files include information on:
    - ancestral_allele
    - evidence
    - clinical_significance
    - global minor allele, frequency and count
The incl_consequences files also include SIFT (if available for this
species) and PolyPhen (human only) predictions.


The data in these files is presented in GVF, a simple tab-delimited format
derived from GFF3. Each record gives the location of a variant along with
the reference and variant sequences, an identifier for the source of the
data (typically a dbSNP rsID), and other relevant information (e.g.
genotypes, allele frequencies, the predicted effect of the variant on a
transcript). A short example is presented below. For more details about GVF
please refer to:

Reese, M.G. et al. A standard variation file format for human genome sequences.
Genome Biology. 2010;11(8):R88 PMID: 20796305

and:

https://github.com/The-Sequence-Ontology/Specifications/blob/master/gvf.md
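
As an illustration of the record layout described above, the following is a
minimal Python sketch that splits one GVF record into its nine tab-delimited
columns and turns the final attributes column into a dictionary. The names
GvfRecord and parse_gvf_line are hypothetical and are not part of any Ensembl
tool. The sketch assumes well-formed lines and does not decode URL-escaped or
comma-separated multi-valued attributes.

```python
from typing import Dict, NamedTuple


class GvfRecord(NamedTuple):
    """One GVF record; hypothetical container used only in this sketch."""
    seqid: str
    source: str
    so_type: str
    start: int
    end: int
    score: str
    strand: str
    phase: str
    attributes: Dict[str, str]


def parse_gvf_line(line: str) -> GvfRecord:
    """Split one tab-delimited GVF record into its nine GFF3-style columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    seqid, source, so_type, start, end, score, strand, phase, raw_attrs = fields

    # The ninth column holds semicolon-separated key=value pairs.
    attributes: Dict[str, str] = {}
    for pair in raw_attrs.split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attributes[key] = value

    return GvfRecord(seqid, source, so_type, int(start), int(end),
                     score, strand, phase, attributes)


# Usage with the first record of the example shown in this README.
record = parse_gvf_line(
    "8\tdbSNP\tSNV\t60059\t60059\t.\t+\t.\t"
    "ID=1;Variant_seq=T;Dbxref=dbSNP_138:rs371829072;Reference_seq=C"
)
print(record.attributes["Dbxref"])  # prints dbSNP_138:rs371829072
```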

We use the sum command for calculating checksums.
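
To verify a download without the sum command, the checksum can be recomputed
in Python. The sketch below assumes the BSD algorithm with 1024-byte blocks,
which is the default of GNU coreutils sum. This README does not state which
variant is used, so treat that as an assumption and compare against the
published checksum values. The function names are hypothetical.

```python
from typing import Iterable, Tuple


def bsd_checksum(chunks: Iterable[bytes]) -> Tuple[int, int]:
    """Return (checksum, blocks) using the BSD sum algorithm (assumed variant)."""
    checksum = 0
    total_bytes = 0
    for chunk in chunks:
        total_bytes += len(chunk)
        for byte in chunk:
            # Rotate the 16-bit checksum right by one bit, then add the byte.
            checksum = (checksum >> 1) | ((checksum & 1) << 15)
            checksum = (checksum + byte) & 0xFFFF
    blocks = (total_bytes + 1023) // 1024  # number of 1024-byte blocks, rounded up
    return checksum, blocks


def bsd_checksum_file(path: str, chunk_size: int = 1 << 16) -> Tuple[int, int]:
    """Compute the BSD checksum of a file, reading it in fixed-size chunks."""
    with open(path, "rb") as handle:
        return bsd_checksum(iter(lambda: handle.read(chunk_size), b""))
```

Calling bsd_checksum_file on a downloaded .gvf.gz file returns the checksum
and block count as a pair of integers. The byte-by-byte loop is slow in pure
Python, so the sum command remains the practical choice for large files.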

Questions about these files can be addressed to the Ensembl helpdesk:
helpdesk@ensembl.org, or to the developers' mailing list: dev@ensembl.org.

-----

Example content from the human germline GVF dump is shown below:

##gff-version 3
##gvf-version 1.07
##file-date 2014-07-13
##genome-build ensembl GRCh38
##species http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=9606
##feature-ontology http://song.cvs.sourceforge.net/viewvc/song/ontology/so.obo?revision=1.283
##data-source Source=ensembl;version=76;url=http://e76.ensembl.org/Homo_sapiens
##file-version 76
##sequence-region 8 1 145138636
8       dbSNP   SNV     60059   60059   .       +       .       ID=1;Variant_seq=T;Dbxref=dbSNP_138:rs371829072;Reference_seq=C
8       dbSNP   SNV     60211   60211   .       +       .       ID=2;Variant_seq=T;Dbxref=dbSNP_138:rs376064598;Reference_seq=G
8       dbSNP   SNV     60220   60220   .       +       .       ID=3;Variant_seq=A;Dbxref=dbSNP_138:rs368575943;Reference_seq=G
8       dbSNP   SNV     60251   60251   .       +       .       ID=4;Variant_seq=T;Dbxref=dbSNP_138:rs372357503;Reference_seq=C
8       dbSNP   SNV     60288   60288   .       +       .       ID=5;Variant_seq=G;Dbxref=dbSNP_138:rs375561901;Reference_seq=C
8       dbSNP   SNV     60290   60290   .       +       .       ID=6;Variant_seq=C;evidence_values=Multiple_observations;Dbxref=dbSNP_138:rs200947342;Reference_seq=A
8       dbSNP   SNV     60323   60323   .       +       .       ID=7;Variant_seq=G;Dbxref=dbSNP_138:rs199540500;Reference_seq=C
8       dbSNP   SNV     60341   60341   .       +       .       ID=8;Variant_seq=G;evidence_values=Multiple_observations;Dbxref=dbSNP_138:rs201908809;Reference_seq=C
8       dbSNP   SNV     60346   60346   .       +       .       ID=9;Variant_seq=G;Dbxref=dbSNP_138:rs78893626;Reference_seq=A
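
A dump like the one above can be read directly from its gzip-compressed file.
The following minimal Python sketch collects the "##" pragma lines and counts
records by the feature type in the third column. The function name
summarise_gvf is hypothetical, and the sketch assumes the columns are
separated by tabs, as the GVF format requires.

```python
import gzip
from collections import Counter
from typing import Dict, List, Tuple


def summarise_gvf(path: str) -> Tuple[Dict[str, List[str]], Counter]:
    """Return the pragmas and a per-type record count for a .gvf.gz file."""
    pragmas: Dict[str, List[str]] = {}
    type_counts: Counter = Counter()
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith("##"):
                # Pragmas such as sequence-region may repeat, so keep a list.
                key, _, value = line[2:].partition(" ")
                pragmas.setdefault(key, []).append(value)
            elif line and not line.startswith("#"):
                columns = line.split("\t")
                type_counts[columns[2]] += 1
    return pragmas, type_counts
```

For instance, summarise_gvf("homo_sapiens-chr8.gvf.gz") (an illustrative file
name following the per-chromosome pattern) would return the header pragmas
and a count such as the number of SNV records in that file.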