# molecular_function
# cellular_component
20150324: Eukaryota_PTN000520178 located in Ski complex (GO:0055087)
20150324: Eukaryota_PTN001107956 located in t-UTP complex (GO:0034455)
# biological_process
20150324: Saccharomycetales_PTN001108026 lost/modified positive regulation of transcription elongation from RNA polymerase II promoter (GO:0032968) capacity
20150324: Saccharomycetales_PTN001108026 lost/modified positive regulation of histone methylation (GO:0031062) capacity
20150324: Eukaryota_PTN000520178 participates in nuclear-transcribed mRNA catabolic process, exonucleolytic, 3'-5' (GO:0034427)
20150324: Eukaryota_PTN000520178 participates in positive regulation of histone methylation (GO:0031062)
20150324: Eukaryota_PTN000520178 participates in positive regulation of transcription elongation from RNA polymerase II promoter (GO:0032968)
20150324: Eukaryota_PTN001107956 participates in maturation of SSU-rRNA (GO:0030490)
20150324: Eukaryota_PTN001107956 participates in regulation of transcription, DNA-templated (GO:0006355)
# Pruned from tree
20150324: Pruned Eukaryota_PTN001108000
# Notes
This tree has a duplication node at the top of the Eukarya containing two subclades (UTP4 and WDR61/SKI8), along with a couple of Archaeal sequences (though sequences from most Archaeal species are missing from this tree). The alignment (compressed, not full view) looks reasonable, but there is nothing convincing to propagate to the root, so I have treated the UTP4 and WDR61/SKI8 clades separately.

** UTP4 clade - One eukaryotic subclade is UTP4, a subunit of the ribosomal Small Subunit Processome (also called the SSU processome), a large complex involved in the initial cleavages of the primary rRNA transcript that separate the small ribosomal subunit (SSU) rRNA from the remainder of the transcript, and in the biogenesis of the small ribosomal subunit. The SSU processome was originally identified and characterized in S. cerevisiae (Dragon et al.
2002, PMID:12068309; Gallagher et al. 2004, PMID:15489292; Bernstein et al. 2004, PMID:15590835; reviewed in Phipps et al. 2011, PMID:21318072). As of September 2014, it has begun to be characterized experimentally in other species such as human (Turner et al. 2012, PMID:22418842; Sato et al. 2013, PMID:24219289; Hu et al. 2011, PMID:21078665), zebrafish (Wilkins et al. 2013, PMID:24147052), and mouse (Gallenberger et al. 2011, PMID:21051332). UTP4 is a confirmed subunit of the SSU processome, specifically part of the UtpA (also known as tUTP) subcomplex (Phipps et al. 2011, PMID:21318072).

Feng et al. 2013 (PMID:24214024) performed an extensive computational analysis of 77 completely sequenced eukaryotic genomes, including representatives of the five eukaryotic supergroups (Opisthokonts, Amoebozoa, Plantae, Excavates, and Chromalveolates), and compared these to sequences from both prokaryotic and Archaeal species for all 51 confirmed and 26 likely SSU processome subunits in S. cerevisiae as indicated in Phipps et al. 2011 (PMID:21318072). In addition, Srivastava et al. 2014 (PMID:24631428) identified SSU processome subunits in the parasitic protist Entamoeba histolytica. UTP4 is one of the 51 confirmed proteins of the S. cerevisiae SSU processome (Phipps et al. 2011, PMID:21318072) and is highly conserved across the 77 eukaryotic species, as listed in Table 1 of Feng et al. 2013 (PMID:24214024). It is also found in Entamoeba histolytica (Srivastava et al. 2014, PMID:24631428).

** WDR61/SKI8 clade - Human WDR61 is experimentally characterized as a member of both the Cdc73/Paf1 complex, which is involved in histone methylation and regulation of transcriptional elongation by RNA polymerase II, and the Ski complex, which is involved in cytoplasmic mRNA degradation. Arabidopsis VIP3 is also experimentally characterized as a member of the Cdc73/Paf1 complex with a role in regulation of transcription. S.
cerevisiae SKI8 is characterized as a member of the cytoplasmic Ski complex. However, in S. cerevisiae, the Cdc73/Paf1 complex does not contain Ski8. It seems most conservative to assume that the ancestral sequence had both activities, and that the role of Ski8 in the Cdc73/Paf1 complex has been lost between the ancestor and S. cerevisiae. There is no experimental evidence to narrow down where this might have occurred, but based on the MSA, I will only block at the Saccharomycetales_PTN001108026 node.

Comments on the tree:
------------------
- The UTP4 clade is missing the S. cerevisiae UTP4 sequence, as well as 8 other fungal sequences, which are incorrectly placed in PTHR22844:SF142 - SMALL NUCLEOLAR RNA-ASSOCIATED PROTEIN 4.
- There is a small node of five fungal sequences separate from both the UTP4 clade and the WDR61/SKI8 clade. I've pruned it because it seems likely to be placed incorrectly even if it does belong in this tree.

Additional annotation comments:
---------------------------
There were experimental MF annotations for human CIRH1A to "poly(A) RNA binding" (GO:0044822) from two large-scale studies: PMID:22681889 and PMID:22658674. Because these were high-throughput experiments, because this protein is normally part of a large complex in which it is hard to determine which proteins bind RNA directly, and because it is not clear that poly(A) RNA binding is biologically relevant, I have chosen not to propagate this MF term.

Annotation inferences using phylogenetic trees
The goal of the GO Reference Genome Project, described in PMID:19578431, is to provide accurate, complete and consistent GO annotations for all genes in twelve model organism genomes. To this end, GO curators are annotating evolutionary trees from the PANTHER database with GO terms describing molecular function, biological process and cellular component.
GO terms based on experimental data from the scientific literature are used to annotate ancestral genes in the phylogenetic tree by sequence similarity (ISS), and unannotated descendants of these ancestral genes are inferred to have inherited these same GO annotations by descent. The annotations are done using a tool called PAINT (Phylogenetic Annotation and INference Tool).
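
The inheritance-by-descent logic described above, including a block at a node where a function was lost, can be illustrated with a minimal Python sketch. This is not PAINT's implementation: the Node class, the propagate function, the two leaf names, and the simplified two-level topology are all hypothetical and chosen only for illustration. The node identifiers and GO terms are the ones recorded in this file for the WDR61/SKI8 clade.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node; not a PAINT or PANTHER data structure."""
    name: str
    children: list[Node] = field(default_factory=list)
    gained: set[str] = field(default_factory=set)  # terms annotated at this node
    lost: set[str] = field(default_factory=set)    # terms blocked at this node


def propagate(node: Node, inherited: frozenset[str] = frozenset()) -> dict[str, set[str]]:
    """Return {leaf name: GO terms inferred by descent}, honoring losses."""
    # A node carries what its ancestors passed down plus its own annotations,
    # minus anything marked lost/modified at this node.
    current = (set(inherited) | node.gained) - node.lost
    if not node.children:
        return {node.name: current}
    result: dict[str, set[str]] = {}
    for child in node.children:
        result.update(propagate(child, frozenset(current)))
    return result


# Assumed, simplified topology: one leaf under the blocked Saccharomycetales
# node and one leaf directly under the annotated Eukaryota node.
saccharomycetales = Node(
    "Saccharomycetales_PTN001108026",
    children=[Node("yeast_SKI8_leaf")],      # hypothetical leaf name
    lost={"GO:0032968", "GO:0031062"},
)
eukaryota = Node(
    "Eukaryota_PTN000520178",
    children=[saccharomycetales, Node("human_WDR61_leaf")],  # hypothetical leaf name
    gained={"GO:0055087", "GO:0034427", "GO:0031062", "GO:0032968"},
)

inferred = propagate(eukaryota)
for leaf, terms in sorted(inferred.items()):
    print(leaf, sorted(terms))
```

Under these assumptions, the leaf below the Saccharomycetales node keeps only the Ski complex and mRNA catabolism terms, while the other leaf inherits all four terms annotated at the Eukaryota node.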