RELEASE.notes

		Compugen Gene Ontology Gene Association Data
				January 14, 2004

		  Compugen Flat File Release 0.6.1

                     Distribution Release Notes

        This document describes the format and content of flat files
that comprise public releases of the Compugen Gene Ontology Gene
Association data.  If you have any questions or comments about
Compugen or this document, please contact Compugen USA, Inc. via email
at GO@cgen.com or write to:

        Compugen USA, Inc.
        7 Center Drive, suite 9
        Jamesburg, NJ 08831
        Phone:  (609) 655-5105
        Fax:  (609) 655-5114

========================================================================== 
TABLE OF CONTENTS
==========================================================================

1. INTRODUCTION

1.1 Compugen
1.2 Release version 0.6.1 
1.3 Statistics

2. DATA AND METHODOLOGIES

2.1 Input Data
2.2 Brief Introduction of Methodologies

3. FILES

3.1 File Descriptions
3.2 File Format
3.3 Sample Gene Ontology Gene Association

4. GENERAL INFORMATION

4.1 Citing Compugen
4.2 Other Methods of Accessing Compugen's GO Gene Association Data
4.3 Request for Corrections and Comments
4.4 Disclaimer

==========================================================================

1. INTRODUCTION

1.1 Compugen 

	Compugen is a genomics-based drug and diagnostic discovery
company, whose mission is to increase the probability of successful
development of novel drug and diagnostic products by incorporating
ideas and methods from mathematics, computer science, and physics into
the disciplines of biology, organic chemistry, and medicine. This
unique capability results in powerful predictive models, which are
both advancing the understanding of important biological phenomena and
enabling the discovery of numerous potential therapeutic and
diagnostic products. Compugen has established collaborations with
leading pharmaceutical and diagnostic companies, and has begun
in-house development of selected putative therapeutic proteins that it
has discovered.  Compugen is publicly traded on Nasdaq (NASDAQ: CGEN)
and on the Tel Aviv Stock Exchange. We have corporate offices in
Israel, with a wholly owned subsidiary headquartered in New Jersey,
Compugen USA, Inc., and marketing and customer support presence in
California and Maryland.

For additional information, please visit Compugen's Corporate Web Site
at www.cgen.com.


1.2 Release version 0.6.1

        Compugen USA, Inc. is distributing the Gene Ontology Gene
Association Data files, as of January 14, 2004.  This release includes
three files as detailed below.

1.3 Statistics

        The current release includes Gene Ontology Gene Associations
for 231373 UniProt Swiss-Prot and TrEMBL (Dec. 15, 2003 release)
protein entries and 540103 GenBank release 139.0 protein entries,
corresponding to 488851 unique proteins with a total of 2387153 GO
associations.  Please note that some proteins are listed in both gene
association files, and some proteins and their annotations may have
more than one record because they occur in more than one species.

2. DATA AND METHODOLOGIES

2.1 Input Data

        Compugen USA, Inc. Gene Ontology Association Data Release
version 0.6.1 was built from data collected and extracted from the
following public databases and files:

        GenBank release 139.0
        UniProt release Dec. 15, 2003
        Medline databases as of April 6, 2001
        And the following files from the Gene Ontology Consortium, downloaded on Oct. 22, 2003:
                gene_association.fb
                gene_association.mgi
                gene_association.sgd
                gene_association.wb
                gene_association.goa_sptr

2.2 Brief Introduction of Methodologies

        Compugen USA, Inc. has developed a proprietary method to
automatically annotate protein sequences using the controlled
vocabularies of the Gene Ontology.  The annotation is centered on
homology comparison, including protein profiles, and on Proloc(TM),
Compugen Ltd.'s proprietary software for protein subcellular
localization.  In addition, information from the definition lines of
the sequence databases and from the Medline database was extracted to
increase the accuracy of annotation and to provide novel annotations.
A detailed description of our methodologies can be found in Genome
Research, volume 12, pages 785-794.

3. FILES

3.1 File Descriptions

	This release consists of three files. The following list
briefly describes each of the files included in the distribution,
along with their sizes.

1. RELEASE.notes - Release notes (this document).  
2. gene_association.Compugen_GenBank - gene association file for genes
from GenBank release 139.0, 161249720 bytes
3. gene_association.Compugen_UniProt - gene association file for
proteins from UniProt release on Dec. 15, 2003, 71301849 bytes
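
The byte counts listed above can serve as a quick integrity check
after download.  The following Python sketch is illustrative only and
is not part of the release; the function name check_sizes and its
arguments are assumptions introduced here.

```python
import os

# Sizes in bytes, as listed in Section 3.1 of these notes.
EXPECTED_SIZES = {
    "gene_association.Compugen_GenBank": 161249720,
    "gene_association.Compugen_UniProt": 71301849,
}


def check_sizes(directory, expected=EXPECTED_SIZES):
    """Compare on-disk file sizes with the expected byte counts.

    Returns a dict mapping each file name to "ok", "size mismatch",
    or "missing".  (Hypothetical helper, not shipped with the data.)
    """
    results = {}
    for name, size in expected.items():
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            results[name] = "missing"
        elif os.path.getsize(path) != size:
            results[name] = "size mismatch"
        else:
            results[name] = "ok"
    return results
```

Calling check_sizes(".") in the directory that holds the downloaded
files reports the status of each file.  A size mismatch may simply
mean that line endings were converted during transfer.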

Both gene_association.Compugen_UniProt and
gene_association.Compugen_GenBank exclude any Gene Ontology
association that was used as input data from the files listed in
Section 2.1.  To request a combined gene association file, please
contact GO@cgen.com.

  

3.2 File Format

        The file formats conform to the guidelines provided by the
Gene Ontology Consortium (www.geneontology.com, see also section 3.3).
All gene associations carry the evidence code 'IEA' to indicate that
the annotations released here were obtained by computational methods.
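
The evidence-code statement above can be checked against a downloaded
file with a short tally.  The Python sketch below is an illustration
under stated assumptions: it assumes the tab-delimited Gene Ontology
Consortium layout in which the evidence code is the seventh column and
lines beginning with "!" are comments.  The function name
evidence_code_counts is hypothetical.

```python
from collections import Counter

# Assumption: evidence code is the 7th tab-delimited column
# (zero-based index 6), as in the GO Consortium file layout.
EVIDENCE_COLUMN = 6


def evidence_code_counts(lines):
    """Tally evidence codes over an iterable of association lines."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip blank lines and "!" comment lines.
        if not line or line.startswith("!"):
            continue
        columns = line.split("\t")
        if len(columns) <= EVIDENCE_COLUMN:
            # Too few columns to contain an evidence code.
            counts["<malformed>"] += 1
            continue
        counts[columns[EVIDENCE_COLUMN]] += 1
    return counts
```

Passing an open file handle for either gene association file to
evidence_code_counts returns the tally.  For this release, 'IEA'
would be expected to be the only code present.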

3.3 Sample Gene Ontology Gene Association

	  An example of a complete gene association is provided
here (tabs are shown condensed to single spaces).

---------------------------------------------------------------------- 
CGEN PrID69417 GI10 GO:0016538 CGEN:ProdVersion0.6.1 IEA F protein taxon:9913 20040107
---------------------------------------------------------------------- 
Legend:
CGEN                    Compugen database name
PrID69417               Unique ID assigned by Compugen
GI10                    GenBank GI number (for UniProt proteins,
                        the UniProt accession number)
GO:0016538              GO number
CGEN:ProdVersion0.6.1   Reference to Compugen internal production version
IEA                     Evidence code
F                       Ontology category
protein                 Indicates that a protein is annotated
taxon:9913              Taxonomy ID number (see NCBI for more details)
20040107                Date in YYYYMMDD format (here Jan. 7, 2004)
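
The legend above can be turned into a small parser.  The Python sketch
below handles only the condensed, whitespace-separated form shown in
this section; the distributed files are tab-delimited and may contain
additional empty columns, so column positions there would differ.  The
names Association and parse_condensed_association are hypothetical and
are not part of the release.

```python
from datetime import date, datetime
from typing import NamedTuple


class Association(NamedTuple):
    """One gene association, with fields named after the legend."""
    db: str              # database name, e.g. CGEN
    compugen_id: str     # unique ID assigned by Compugen
    external_id: str     # GenBank GI number or UniProt accession
    go_id: str           # GO number
    reference: str       # Compugen production version reference
    evidence: str        # evidence code
    aspect: str          # ontology category
    object_type: str     # type of annotated object
    taxon_id: int        # numeric NCBI taxonomy ID
    annotated_on: date   # date parsed from YYYYMMDD


def parse_condensed_association(line):
    """Parse one association in the condensed ten-field sample form."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError("expected 10 fields, got %d" % len(fields))
    prefix, _, number = fields[8].partition(":")
    if prefix.lower() != "taxon" or not number.isdigit():
        raise ValueError("bad taxon field: %r" % fields[8])
    annotated_on = datetime.strptime(fields[9], "%Y%m%d").date()
    return Association(*fields[:8], int(number), annotated_on)


record = parse_condensed_association(
    "CGEN PrID69417 GI10 GO:0016538 CGEN:ProdVersion0.6.1 "
    "IEA F protein taxon:9913 20040107"
)
print(record.go_id, record.taxon_id, record.annotated_on.isoformat())
```

The taxon field is matched case-insensitively so that both "taxon:"
and "Taxon:" prefixes are accepted.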

4. GENERAL INFORMATION

4.1 Citing Compugen

	When you use Compugen data in your research, please cite the
source as: Compugen USA, Inc. (http://www.cgen.com).

4.2 Other Methods of Accessing Compugen's GO Gene Association Data

	The data provided here, or updates if any, can also be
obtained through e-mail inquiry to GO@cgen.com.

4.3 Request for Corrections and Comments

	 We welcome your suggestions for improvements. Compugen's GO
gene association data is a work in progress, so we are especially
interested in learning about errors or inconsistencies in the data.
Suggestions and corrections can be sent by e-mail to GO@cgen.com.


4.4 Disclaimer

The Compugen public GO Gene Association Data (the "Information") is
provided on an "as is" basis. Compugen expressly disclaims all implied
warranties, including without limitation any warranty of
non-infringement, and any warranty in respect of quality or fitness
for any particular use of the Compugen public GO Annotation.

Compugen Inc. does not warrant or assume any legal liability or
responsibility for the accuracy, completeness, or usefulness of any of
the Information, or process disclosed. Compugen expressly disclaims
any warranty that the Information will not be subject to patents which
already have been issued or to patents which may be issued in the
future, owned by any other entity including Compugen itself or
entities related to Compugen. The information is experimental in
nature, and is not approved by the U.S. Food and Drug Administration
or any other regulatory body.  Compugen will not be liable for any
indirect, special, incidental, consequential or punitive damages or
any other damages based on economic harm, injury to property or lost
profits, regardless of whether Compugen has been advised of the
possibility of such damages.  Compugen
shall not be responsible for any damage caused, directly or
indirectly, in connection with your use of the Information.

Disclaimer of Endorsement 

Information

It is not the intention of Compugen Inc. to provide definitive
functional annotation for the genes and proteins described in this
data file, but rather to provide users with information to better
understand the functions of these genes and proteins and their
involvement in biological processes.

Copyright Status

Unless stated otherwise, the information may be freely downloaded and
reproduced. However, any publication or commercial use of the
Information must include acknowledgement of Compugen as the data
source. Please reference Compugen as: Compugen USA, Inc.:
http://www.cgen.com/.


Compugen USA, Inc. public GO Annotation Availability

The Compugen USA, Inc. public GO Annotation is designed to provide and
encourage access within the scientific community to an up-to-date and
comprehensive sequence annotation. Therefore, Compugen USA,
Inc. places no restrictions on the use or distribution of the Compugen
USA, Inc. public GO Annotation data.


January 14, 2004