Universal Protein Resource (UniProt)
====================================

The Universal Protein Resource (UniProt), a collaboration between the European
Bioinformatics Institute (EBI), the SIB Swiss Institute of Bioinformatics, and
the Protein Information Resource (PIR), comprises three databases, each
optimized for different uses. The UniProt Knowledgebase (UniProtKB) is the
central access point for extensively curated protein information, including
function, classification and cross-references. The UniProt Reference Clusters
(UniRef) combine closely related sequences into a single record to speed up
sequence similarity searches. The UniProt Archive (UniParc) is a comprehensive
repository of all protein sequences, consisting only of unique identifiers and
sequences.

IDMAPPING
=========

This directory, databases/uniprot/current_release/knowledgebase/idmapping/,
contains the idmapping data files, which are updated in conjunction with the
UniProt Knowledgebase (UniProtKB). Whenever available, the mappings are
extracted from the UniProtKB records. All files listed below contain the
complete data sets corresponding to the most recent release.

1) idmapping.dat

This file has three columns, delimited by tab:

1. UniProtKB-AC
2. ID_type
3. ID

where ID_type is the database name as it appears in UniProtKB
cross-references, and as supported by the ID mapping tool on the UniProt web
site, http://www.uniprot.org/mapping, and where ID is the identifier in that
cross-referenced database.

2) idmapping_selected.tab

We also provide this tab-delimited table, which includes the following
mappings, one per column:

1. UniProtKB-AC
2. UniProtKB-ID
3. GeneID (EntrezGene)
4. RefSeq
5. GI
6. PDB
7. GO
8. UniRef100
9. UniRef90
10. UniRef50
11. UniParc
12. PIR
13. NCBI-taxon
14. MIM
15. UniGene
16. PubMed
17. EMBL
18. EMBL-CDS
19. Ensembl
20. Ensembl_TRS
21. Ensembl_PRO
22. Additional PubMed

3) Example files

idmapping_selected.tab.example contains the first 1,000 lines of
idmapping_selected.tab.
idmapping.dat.example contains the first 10,000 lines of idmapping.dat.

4) by_organism

We provide separate ID mapping tables for selected model organisms in the
subdirectory by_organism.

5) idmapping.dat.2015_03 and idmapping_selected.tab.2015_03

These are archived versions of the files idmapping.dat and
idmapping_selected.tab, respectively, for release 2015_03. This was the last
release before proteome redundancy reduction
(http://www.uniprot.org/help/proteome_redundancy), which caused the size of
UniProtKB/TrEMBL to drop from 92 million to 47 million entries. Users trying
to map obsolete identifiers to external databases, and vice versa, may find
these files useful.

6) README

This file.

The /complete/docs subdirectory contains various UniProt documents.

--------------------------------------------------------------------------------
LICENSE
--------------------------------------------------------------------------------
We have chosen to apply the Creative Commons Attribution 4.0 International
(CC BY 4.0) License (https://creativecommons.org/licenses/by/4.0/) to all
copyrightable parts of our databases.

(c) 2002-2022 UniProt Consortium

--------------------------------------------------------------------------------
DISCLAIMER
--------------------------------------------------------------------------------
We make no warranties regarding the correctness of the data, and disclaim
liability for damages resulting from its use. We cannot provide unrestricted
permission regarding the use of the data, as some data may be covered by
patents or other rights.

Any medical or genetic information is provided for research, educational and
informational purposes only. It is not in any way intended to be used as a
substitute for professional medical advice, diagnosis, treatment or care.
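
As an illustration of the idmapping.dat and idmapping_selected.tab layouts
described above, the following Python sketch shows one way the two files
might be read. It is not part of the UniProt distribution. The function names
read_idmapping_dat and read_selected_row and the constant SELECTED_COLUMNS
are hypothetical, and the sample rows in the demo are made up for
illustration rather than taken from a release. This README does not say how
multi-valued cells in idmapping_selected.tab are separated, so the sketch
leaves every cell as a raw string.

```python
# Column names for idmapping_selected.tab, in the order listed in this README.
SELECTED_COLUMNS = [
    "UniProtKB-AC", "UniProtKB-ID", "GeneID (EntrezGene)", "RefSeq", "GI",
    "PDB", "GO", "UniRef100", "UniRef90", "UniRef50", "UniParc", "PIR",
    "NCBI-taxon", "MIM", "UniGene", "PubMed", "EMBL", "EMBL-CDS", "Ensembl",
    "Ensembl_TRS", "Ensembl_PRO", "Additional PubMed",
]


def read_idmapping_dat(lines):
    """Group idmapping.dat rows as {UniProtKB-AC: {ID_type: [ID, ...]}}.

    `lines` is any iterable of text lines, for example an open file handle.
    """
    mapping = {}
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        accession, id_type, identifier = fields
        mapping.setdefault(accession, {}).setdefault(id_type, []).append(identifier)
    return mapping


def read_selected_row(line):
    """Turn one idmapping_selected.tab line into a {column name: value} dict."""
    fields = line.rstrip("\r\n").split("\t")
    # Pad short rows so that every column named in the README is present.
    fields += [""] * (len(SELECTED_COLUMNS) - len(fields))
    return dict(zip(SELECTED_COLUMNS, fields))


if __name__ == "__main__":
    # Made-up rows in the three-column idmapping.dat layout.
    demo = [
        "A0TEST\tGeneID\t111\n",
        "A0TEST\tRefSeq\tXP_1\n",
        "A0TEST\tRefSeq\tXP_2\n",
    ]
    print(read_idmapping_dat(demo))
    print(read_selected_row("A0TEST\tTEST_HUMAN\t111\n")["UniProtKB-ID"])
```

Because both functions process one line at a time, they can be fed an open
file handle directly, so a large release file need not be loaded into memory
before parsing. Note that read_idmapping_dat still accumulates every mapping
it sees, so for the full idmapping.dat it may be preferable to filter rows
while iterating.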