This directory contains all the 60-way Pecan multiple alignments corresponding
to Release 114 of Ensembl (see http://www.ensembl.org for further details and
credits about the Ensembl project).

The set of species was:
 - Opossum (ASM229v1)
 - Elephant (loxAfr3)
 - Greater horseshoe bat (mRhiFer1_v1.p)
 - Narwhal (NGI_Narwhal_1)
 - Beluga whale (ASM228892v3)
 - Vaquita (mPhoSin1.pri)
 - Sperm whale (ASM283717v2)
 - Blue whale (mBalMus1.v2)
 - Yarkand deer (CEY_v1)
 - Goat (ARS1)
 - Hybrid - Bos Indicus (UOA_Brahman_1)
 - Domestic yak (LU_Bosgru_v3.0)
 - Pig (Sscrofa11.1)
 - Chacoan peccary (CatWag_v2_BIUU_UCD)
 - Arabian camel (CamDro2)
 - Horse (EquCab3.0)
 - Lion (PanLeo1.0)
 - Leopard (PanPar1.0)
 - Dingo (ASM325472v1)
 - Dog (ROS_Cfam_1.0)
 - Mouse Lemur (Mmur_3.0)
 - Gibbon (Nleu_3.0)
 - Sumatran orangutan (Susie_PABv2)
 - Gorilla (gorGor4)
 - Human (GRCh38)
 - Bonobo (panpan1.1)
 - Chimpanzee (Pan_tro_3.0)
 - Vervet-AGM (ChlSab1.1)
 - Crab-eating macaque (Macaca_fascicularis_6.0)
 - Macaque (Mmul_10)
 - Olive baboon (Panubis1.0)
 - White-tufted-ear marmoset (mCalJac1.pat.X)
 - Rabbit (OryCun2.0)
 - Guinea Pig (Cavpor3.0)
 - Ryukyu mouse (CAROLI_EIJ_v1.1)
 - Mouse (GRCm39)
 - Shrew mouse (PAHARI_EIJ_v1.1)
 - Prairie vole (MicOch1.0)
 - Chinese hamster CHOK1GS (CHOK1GS_HDv1)
 - Northern American deer mouse (HU_Pman_2.1)
 - Eurasian red squirrel (mSciVul1.1)
 - Alpine marmot (marMar2.1)
 - Platypus (mOrnAna1.p.v1)
 - Eastern brown snake (EBS10Xv2-PRI)
 - Indian cobra (Nana_v5)
 - Green anole (AnoCar2.0v2)
 - Common wall lizard (PodMur_1.0)
 - Argentine black and white tegu (HLtupMer3)
 - Australian saltwater crocodile (CroPor_comp1)
 - Great Tit (Parus_major1.1)
 - Common canary (SCA1)
 - Zebra finch (bTaeGut1_v1.p)
 - Kakapo (bStrHab1_v1.p)
 - Golden eagle (bAquChr1.2)
 - Duck (CAU_duck1.0)
 - Turkey (Turkey_5.1)
 - Chicken (bGalGal1.mat.broiler.GRCg7b)
 - Japanese quail (Coturnix_japonica_2.0)
 - Goodes thornscrub tortoise (rGopEvg1_v1.p)
 - Three-toed box turtle (T_m_triunguis-2.0)

The species tree was:
( ( ( ( ( ( ( (
( Serinus canaria:0.01609, Taeniopygia guttata:0.01971 ):0.00742, Parus major:0.016 ):0.03999, Strigops habroptila:0.03123 ):0.00363, Aquila chrysaetos chrysaetos:0.01652 ):0.01124, ( ( ( Gallus gallus reference breed:0.01082, Coturnix japonica:0.0192 ):0.00137, Meleagris gallopavo reference strain:0.01634 ):0.0295, Anas platyrhynchos platyrhynchos:0.03392 ):0.01143 ):0.05538, Crocodylus porosus:0.06448 ):0.01298, ( Terrapene triunguis:0.00976, Gopherus evgoodei:0.01864 ):0.04262 ):0.01135, ( ( Podarcis muralis:0.05632, Salvator merianae:0.0704 ):0.00787, ( ( Pseudonaja textilis:0.01587, Naja naja:0.01943 ):0.09114, Anolis carolinensis reference strain:0.07557 ):0.01305 ):0.07357 ):0.02923, ( ( ( ( ( ( ( ( ( ( ( Cricetulus griseus:0.02331, Microtus ochrogaster:0.02605 ):0.00176, Peromyscus maniculatus bairdii:0.02125 ):0.00606, ( ( Mus musculus reference CL57BL6 strain:0.00814, Mus caroli strain CAROLI_EIJ:0.00867 ):0.00509, Mus pahari strain PAHARI_EIJ:0.01121 ):0.02408 ):0.03858, Cavia porcellus:0.06846 ):0.00199, ( Marmota marmota marmota:0.02069, Sciurus vulgaris:0.02098 ):0.01752 ):0.0053, Oryctolagus cuniculus:0.05566 ):0.00284, ( ( ( ( ( ( ( ( Pan troglodytes:0.00168, Pan paniscus:0.00306 ):0.00278, Homo sapiens:0.00274 ):0.00104, Gorilla gorilla gorilla:0.00701 ):0.00342, Pongo abelii:0.00671 ):0.00117, Nomascus leucogenys:0.01685 ):0.00357, ( ( ( Macaca mulatta:0.00115, Macaca fascicularis:0.00533 ):0.0028, Papio anubis:0.00472 ):0.00117, Chlorocebus sabaeus:0.00518 ):0.00859 ):0.00562, Callithrix jacchus:0.02358 ):0.01713, Microcebus murinus:0.03684 ):0.00432 ):0.00439, ( ( ( ( ( ( ( ( ( Monodon monoceros:0.00197, Delphinapterus leucas:0.00236 ):0.00247, Phocoena sinus:0.0061 ):0.00841, Physeter catodon:0.01179 ):0.00166, Balaenoptera musculus:0.01245 ):0.01498, ( ( ( Bos indicus x Bos taurus:0.00251, Bos grunniens:0.01194 ):0.01023, Capra hircus reference breed:0.01093 ):0.00393, Cervus hanglu yarkandensis:0.0123 ):0.02052 ):0.00458, ( Catagonus 
wagneri:0.01739, Sus scrofa reference breed:0.02226 ):0.01581 ):0.00238, Camelus dromedarius:0.03147 ):0.00795, Rhinolophus ferrumequinum:0.0438 ):0.00124, ( ( ( Panthera pardus:0.00079, Panthera leo:0.00164 ):0.02393, ( Canis lupus familiaris reference breed:0.00154, Canis lupus dingo:0.00155 ):0.0263 ):0.01067, Equus caballus breed thoroughbred:0.02949 ):0.00143 ):0.00618 ):0.005, Loxodonta africana:0.05129 ):0.05693, Monodelphis domestica:0.08729 ):0.02069, Ornithorhynchus anatinus:0.11151 ):0.04523 );

First, Mercator is used to build a synteny map between the genomes, and then
Pecan builds alignments in these syntenic regions.

Pecan is a global multiple sequence alignment program that makes the
probabilistic consistency methodology practical for large numbers of sequences
of practically arbitrary length. As input it takes a set of sequences and a
phylogenetic tree. The parameters and heuristics it employs are highly
user-configurable. It is written entirely in Java and also requires the
installation of Exonerate.
Read more about Pecan: https://github.com/benedictpaten/pecan

GERP scores the conservation of each position in the alignment and defines
constrained elements based on these conservation scores.
Read more about GERP: http://mendel.stanford.edu/SidowLab/downloads/gerp/index.html

Alignments are grouped by human chromosome, and then by coordinate system.
Alignments containing duplications in human are dumped once per duplicated
segment. The files named *.other*.emf contain alignments that do not include
any human region. Each file contains up to 200 alignments.

An emf2maf parser is available with the Ensembl Compara API, in the
scripts/dumps directory. Alternatively, you can download it using the GitHub
frontend:
https://github.com/Ensembl/ensembl-compara/raw/master/scripts/dumps/emf2maf.pl
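
The species tree given above is in Newick notation, with leaf names that
contain spaces. As a minimal sketch (not part of the Ensembl tooling), the
Python snippet below pulls the leaf names and their terminal branch lengths out
of such a string with a regular expression. The function name
leaf_branch_lengths and the three-species sample string are illustrative
assumptions; the sample is a subtree copied from the tree above. Run on the
full tree, it should return one entry per species in the list.

```python
import re

# A leaf is a label that directly follows "(" or "," and is followed by
# ":<branch length>". Internal nodes follow ")" and are therefore skipped.
LEAF_PATTERN = re.compile(r"[(,]\s*([^(),:;]+?)\s*:\s*([0-9]+(?:\.[0-9]+)?)")


def leaf_branch_lengths(newick):
    """Return {leaf name: terminal branch length} for a Newick string.

    Hypothetical helper for illustration only; it assumes unquoted labels
    and plain decimal branch lengths, as used in the tree in this README.
    """
    return {name: float(length) for name, length in LEAF_PATTERN.findall(newick)}


# Small subtree copied from the species tree in this README.
sample = (
    "( ( Pan troglodytes:0.00168, Pan paniscus:0.00306 ):0.00278, "
    "Homo sapiens:0.00274 ):0.00104;"
)

leaves = leaf_branch_lengths(sample)
for name, length in sorted(leaves.items()):
    print(f"{name}\t{length}")
```

A regular expression is enough here because only the leaves are needed. To
work with the tree topology itself, a full Newick parser would be the better
choice.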