Universal Protein Resource (UniProt)
====================================


The Universal Protein Resource (UniProt), a collaboration between the European
Bioinformatics Institute (EBI), the SIB Swiss Institute of Bioinformatics, and
the Protein Information Resource (PIR), comprises three databases, each
optimized for different uses. The UniProt Knowledgebase (UniProtKB) is the
central access point for extensively curated protein information, including
function, classification and cross-references. The UniProt Reference Clusters
(UniRef) combine closely related sequences into a single record to speed up
sequence similarity searches. The UniProt Archive (UniParc) is a comprehensive
repository of all protein sequences, consisting only of unique identifiers and
sequences.

The UniProt Knowledgebase (UniProtKB) was created from Swiss-Prot, TrEMBL
and PIR-PSD. It consists of two parts: one containing fully manually annotated
records, and the other containing computationally analysed records awaiting
full manual annotation. The two sections are referred to as the Swiss-Prot
section of the UniProt Knowledgebase (UniProtKB/Swiss-Prot) and the TrEMBL
section of the UniProt Knowledgebase (UniProtKB/TrEMBL), respectively. PIR-PSD
release 80.0 of 31-Dec-2004, the last release of PIR-PSD, has been fully
integrated into these sections.

This directory, databases/uniprot/current_release/knowledgebase/taxonomic_divisions,
contains the following files, updated every eight weeks:

reldate.txt
uniprot_sprot_archaea.dat.gz
uniprot_sprot_bacteria.dat.gz
uniprot_sprot_fungi.dat.gz
uniprot_sprot_human.dat.gz
uniprot_sprot_invertebrates.dat.gz
uniprot_sprot_mammals.dat.gz
uniprot_sprot_plants.dat.gz
uniprot_sprot_rodents.dat.gz
uniprot_sprot_vertebrates.dat.gz
uniprot_sprot_viruses.dat.gz
uniprot_trembl_archaea.dat.gz
uniprot_trembl_bacteria.dat.gz
uniprot_trembl_fungi.dat.gz
uniprot_trembl_human.dat.gz
uniprot_trembl_invertebrates.dat.gz
uniprot_trembl_mammals.dat.gz
uniprot_trembl_plants.dat.gz
uniprot_trembl_rodents.dat.gz
uniprot_trembl_unclassified.dat.gz
uniprot_trembl_vertebrates.dat.gz
uniprot_trembl_viruses.dat.gz
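
The .dat.gz files listed above are gzip-compressed flat files in which each
entry ends with a line containing only "//". As a minimal sketch of how one
of them might be streamed without decompressing it to disk first, the Python
below yields one entry at a time. The function names iter_entries and
count_entries are hypothetical helpers introduced for illustration, and UTF-8
is an assumed encoding; neither is part of any UniProt tool.

```python
import gzip
from typing import Iterator, List, TextIO


def iter_entries(handle: TextIO) -> Iterator[List[str]]:
    """Yield each flat-file entry as a list of lines, without the '//' line."""
    entry: List[str] = []
    for raw in handle:
        line = raw.rstrip("\r\n")
        if line == "//":  # entry terminator in the flat-file format
            if entry:
                yield entry
            entry = []
        else:
            entry.append(line)
    # Lines after the last '//' (an unterminated entry) are ignored here.


def count_entries(path: str) -> int:
    """Count the entries in one gzip-compressed flat file."""
    # Assumption: the file can be decoded as UTF-8 text.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return sum(1 for _ in iter_entries(handle))
```

For example, count_entries("uniprot_sprot_archaea.dat.gz") would return the
number of entries in a locally downloaded copy of that file.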

Entries are assigned to these files as follows. Every entry is present in
exactly one file.

*human.dat          contains all human entries
*mammals.dat        contains all mammalian entries except those from human and
                    rodents
*vertebrates.dat    contains all vertebrate entries except those from mammals
*invertebrates.dat  contains all eukaryotic entries except those from
                    vertebrates, fungi and plants
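
The rules above can be sketched as a small decision function. This is an
illustration under stated assumptions, not the procedure used to build the
files: division_for is a hypothetical helper; the taxon names tested
(Mammalia, Rodentia, Vertebrata, Eukaryota, Fungi, Viridiplantae) are assumed
to be the relevant nodes in the lineage; human entries are assumed to be
recognised by NCBI taxonomy identifier 9606; and the rodents, fungi and plants
results are inferred from the file names rather than from the rules. Entries
that the rules do not describe return None.

```python
from typing import Optional, Sequence

HUMAN_TAXON_ID = 9606  # NCBI taxonomy identifier for Homo sapiens


def division_for(lineage: Sequence[str],
                 taxon_id: Optional[int] = None) -> Optional[str]:
    """Return the division suffix for an entry, or None if not covered."""
    taxa = set(lineage)
    if taxon_id == HUMAN_TAXON_ID:
        return "human"
    if "Mammalia" in taxa:
        # Assumption: rodent entries go to the *rodents file.
        return "rodents" if "Rodentia" in taxa else "mammals"
    if "Vertebrata" in taxa:
        return "vertebrates"
    if "Eukaryota" in taxa:
        # Assumption: fungi and plants go to their own files.
        if "Fungi" in taxa:
            return "fungi"
        if "Viridiplantae" in taxa:
            return "plants"
        return "invertebrates"
    # Archaea, bacteria, viruses and unclassified entries are not
    # described by the rules above, so this sketch does not guess.
    return None
```

Checking the most specific group first (human, then mammals, then vertebrates,
then the remaining eukaryotes) is what places every entry in exactly one file.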


In addition to these sets, which are based on nodes occurring in the Organism
Classification (OC) lines of UniProt Knowledgebase entries, several other data
sets are available:

o the reference proteomes:
  ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/reference_proteomes/

o the pan proteomes:
  ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/pan_proteomes/

--------------------------------------------------------------------------
Note:
The directory databases/uniprot/current_release/complete contains the
eight-weekly updates of the UniProt Knowledgebase, consisting of the
Swiss-Prot Protein Knowledgebase and the TrEMBL Protein Sequence Database.
Both UniProtKB/Swiss-Prot and UniProtKB/TrEMBL are available separately in
flat file, XML and FASTA formats.

The /complete/docs subdirectory contains various UniProt documents.


--------------------------------------------------------------------------------
  LICENSE
--------------------------------------------------------------------------------
We have chosen to apply the Creative Commons Attribution 4.0 International
(CC BY 4.0) License (https://creativecommons.org/licenses/by/4.0/) to all
copyrightable parts of our databases.

(c) 2002-2022 UniProt Consortium

--------------------------------------------------------------------------------
  DISCLAIMER
--------------------------------------------------------------------------------
We make no warranties regarding the correctness of the data, and disclaim
liability for damages resulting from its use. We cannot provide unrestricted
permission regarding the use of the data, as some data may be covered by patents
or other rights.

Any medical or genetic information is provided for research, educational and
informational purposes only. It is not in any way intended to be used as a
substitute for professional medical advice, diagnosis, treatment or care.