NAME

IprMatches - IprMatches index utilities.


SYNOPSIS

use Index::IprMatches;


DESCRIPTION

  This package is used to index the match.xml file.
  It indexes the crc64 attribute and the id of the proteins (see below).
  It also creates a raw file if the crc64 of one of your sequences
  matches a known crc64 in the match.xml file (which contains the known
  hit proteins in InterPro).
  # Example usage
  use Index::IprMatches;
  my ($res, $msg, $index, $r_inx);
  ($res, $index) = Index::IprMatches->new($file);   # $file: the match.xml file to index
  die $index unless $res;
  ($res, $msg) = $index->setRecDel('dumper', '</protein>');
  die $msg unless $res;
  # Pass the list of entries you want to index; the allowed rules are defined in $regexp in this module.
  # If no argument is given, the index is built with all the rules described in $regexp.
  ($res, $r_inx) = $index->buildIndex(['ac', 'name']);
  die $r_inx unless $res;
  # Alternatively, build the index from your own key-value pairs based on regular expressions.
  # Only the first match is taken into account, so to index several things found on the same line
  # you need to create another key-value pair.
  ($res, $r_inx) = $index->buildIndex({ 'id' => '>(\S+)', 'name' => '^\s+\w+\s+(\S+)' });
  die $r_inx unless $res;
  ($res, $msg) = $index->indexOut($r_inx);          # needs a reference to a hash table
  die $msg unless $res;
  my $id = 'RS16_ECOLI';
  my ($pos, $entry, $name, $fields);
  ($res, $pos) = $index->getIndex($id);     # position in the file of the match.xml entry containing $id
  die $pos unless $res;
  # getEntry returns a reference to an array, because several entries may be found.
  ($res, $entry) = $index->getEntry($id);   # the complete entry
  die $entry unless $res;
  $entry = $entry->[0] if $entry;
  # Either parse the whole entry once...
  ($res, $msg) = $index->parseFields(\$entry);
  die $msg unless $res;
  ($res, $name) = $index->getField('name'); # the name of this entry
  $name = $name->[0] if $name;
  ($res, $fields) = $index->getField(['ac', 'name', 'desc']); # the ac, name and description of this entry
  if ($fields) {
      # Values are assumed to come back in the order requested.
      my $ac = $fields->[0];
      my $nm = $fields->[1];
      my $de = $fields->[2];
  }
  # ... see below for the fields you can retrieve.
  ($res, $name) = $index->get_name();       # the name of this entry
  $name = $name->[0] if $name;
  # ...or, more simply, parse the entry on the fly
  ($res, $name) = $index->getField('name', \$entry); # the name of this entry
  $name = $name->[0] if $name;
  # Functions specific to this module
  # =================================
  my $ofh = \*STDOUT;   # reference to a file handle opened for writing
  my $appl = [qw(BlastProDom Coil Seg FingerPrintScan)];
  ($res, $msg) = $index->write_raw($index->parse(\$entry, $appl), $seqid, $ofh, $ipr, $go);
  die $msg unless $res;
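
The indexing idea described above (split the file on a record delimiter, record where each record starts, and key that position by the first regular-expression match) can be sketched in a few lines of plain Perl. This is only an illustration under stated assumptions, not the implementation used by Index::IprMatches: the sample XML layout, the two regular expressions, and the helper names `build_offset_index` and `fetch_entry` are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real match.xml layout may differ.
my $xml = <<'XML';
<protein id="P00001" name="TEST1_HUMAN" length="10" crc64="AAAAAAAAAAAAAAAA">
  <match id="M0001" dbname="DB_A"/>
</protein>
<protein id="P00002" name="TEST2_HUMAN" length="20" crc64="BBBBBBBBBBBBBBBB">
  <match id="M0002" dbname="DB_A"/>
</protein>
XML

# Build { field => { key => byte offset } }.
# Only the first match of each rule within a record is kept.
sub build_offset_index {
    my ($fh, $rules) = @_;
    my %index;
    local $/ = '</protein>';              # record delimiter
    my $pos = tell $fh;                   # offset where the next record starts
    while (defined(my $rec = <$fh>)) {
        for my $field (keys %$rules) {
            my $re = $rules->{$field};
            $index{$field}{$1} = $pos if $rec =~ /$re/;
        }
        $pos = tell $fh;
    }
    return \%index;
}

# Jump to a stored offset and read exactly one record.
sub fetch_entry {
    my ($fh, $pos) = @_;
    local $/ = '</protein>';
    seek $fh, $pos, 0 or return;
    my $rec = <$fh>;
    return $rec;
}

# An in-memory file handle stands in for the real file.
open my $fh, '<', \$xml or die "Cannot open in-memory file: $!";
my $index = build_offset_index($fh, {
    id    => 'id="([^"]+)"',
    crc64 => 'crc64="([^"]+)"',
});

# Look up an entry by the crc64 of a query sequence.
my $entry = fetch_entry($fh, $index->{crc64}{'BBBBBBBBBBBBBBBB'});
print $entry, "\n";
```

Storing offsets rather than the entries themselves keeps the index small, and the same table can serve both id and crc64 lookups.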


VERSIONS

$Id: IprMatches.pm.html,v 1.1.1.1 2005/08/18 13:18:25 hunter Exp $

Copyright (c) European Bioinformatics Institute 2002


AUTHORS / ACKNOWLEDGEMENTS

Emmanuel Quevillon <tuco@ebi.ac.uk>

new

 Description:  Creates a new Index::IprMatches object.
 Arguments:    $in, the file to index
               $tool, whether to use Dispatcher::Tool to read index.conf values (optional)
 Returns:      1, $self on success
               0, msg on failure
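
The methods documented here all return a status flag followed by either the result or an error message. The sketch below shows that calling convention with a hypothetical stand-in class, `My::Indexer`; it is not the real constructor, and the existence check is only an assumed example of a failure condition.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

package My::Indexer;    # hypothetical stand-in, not Index::IprMatches

# Returns (1, $self) on success and (0, $message) on failure.
sub new {
    my ($class, $in) = @_;
    return (0, 'No file given') unless defined $in && length $in;
    return (0, "File '$in' does not exist") unless -e $in;
    my $self = bless { file => $in }, $class;
    return (1, $self);
}

package main;

# Failure: the second value is an error message.
my ($bad_res, $bad_msg) = My::Indexer->new('/no/such/dir/match.xml');
print "failure: $bad_msg\n" unless $bad_res;

# Success: the second value is the object.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
my ($res, $obj) = My::Indexer->new($tmp_name);
die $obj unless $res;
print "created a ", ref($obj), " object for $tmp_name\n";
```

Because the second return value doubles as the error message, callers can write `die $obj unless $res;` directly, as the examples in DESCRIPTION do.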

_init

 Description: Initializes the record delimiter and the file, and parses the configuration from the index configuration file.
 Argument:    
 Returns:     1, '' on success
              0, msg on error

parse

 Description:  Parse the entry retrieved from the indexed file.
 Arguments:    $text, scalar reference to the complete entry.
               $r_appl, reference to an applications array.
               $r_taxo, reference to a taxonomy list array (optional) when user filters output with taxonomy.
 Returns:      references to three hash tables on success
               0, msg on failure
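
As a rough illustration of what parsing an entry into three hash tables while filtering on an application list might look like, here is a minimal sketch. Everything in it is an assumption made for the example: the element and attribute names in the sample entry, the choice of hash keys and values, matching applications against a `dbname` attribute, and the helper name `parse_entry`. The module's actual parsing rules are not shown in this document.

```perl
use strict;
use warnings;

# Hypothetical entry; element and attribute names are assumptions.
my $entry = <<'XML';
<protein id="P00001" name="TEST1_HUMAN" length="10" crc64="AAAAAAAAAAAAAAAA">
  <match id="M0001" name="dom_a" dbname="DB_A"><ipr id="IPR_HYPO1"/></match>
  <match id="M0002" name="dom_b" dbname="DB_B"><ipr id="IPR_HYPO2"/></match>
</protein>
XML

# Returns three hash references on success, or (0, $message) on failure.
sub parse_entry {
    my ($text, $r_appl) = @_;
    return (0, 'Expected a scalar reference') unless ref $text eq 'SCALAR';

    my %wanted = map { uc($_) => 1 } @$r_appl;
    my (%prot, %ipr, %matches);

    if ($$text =~ /<protein\b[^>]*\bname="([^"]+)"/) {
        $prot{$1} = 1;
    }
    else {
        return (0, 'No protein element found');
    }

    while ($$text =~ m{<match\b([^>]*)>(.*?)</match>}gs) {
        my ($attrs, $body) = ($1, $2);
        my ($id) = $attrs =~ /\bid="([^"]+)"/;
        my ($db) = $attrs =~ /\bdbname="([^"]+)"/;
        # Keep only the matches produced by a requested application.
        next unless defined $id && defined $db && $wanted{uc $db};
        $matches{$id} = $db;
        $ipr{$1} = 1 while $body =~ /<ipr\b[^>]*\bid="([^"]+)"/g;
    }
    return (\%prot, \%ipr, \%matches);
}

my ($r_prot, $r_ipr, $r_matches) = parse_entry(\$entry, ['db_a']);
die $r_ipr unless ref $r_prot;    # on failure the second value is the message
print join(',', sort keys %$r_matches), "\n";
```

Since the first return value is either a hash reference or 0, the caller can tell success from failure with `ref`.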

write_raw

 Description:  Creates a raw file for the entries retrieved and parsed from match.xml.
 Arguments:    $r_prot, reference to hash table containing protein name as key
               $r_ipr, reference to hash table containing interpro entries as key
               $r_matches, reference to hash table containing matches entries as key
               $seq_id, ID of the sequence that returned this iprmatches entry.
               $fh, a reference to a file handle to write to.
               $ipr, was iprlookup requested?
               $go, were GO terms requested?
 Returns:      1, '' on success.
               0, msg on failure
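
To show how a file handle reference and the two optional flags might be used together, here is a small sketch of a writer that follows the same return convention. It is not the module's write_raw: the helper `write_rows`, its simplified argument list, the tab-separated column layout, and the `NULL` placeholder are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical writer; the column layout is an assumption,
# not the raw format produced by the module.
sub write_rows {
    my ($rows, $seq_id, $fh, $ipr, $go) = @_;
    return (0, 'No file handle given') unless defined $fh;
    for my $row (@$rows) {
        my @cols = ($seq_id, $row->{appl}, $row->{match});
        # Optional columns are added only when the caller asked for them.
        push @cols, (defined $row->{ipr} ? $row->{ipr} : 'NULL') if $ipr;
        push @cols, (defined $row->{go}  ? $row->{go}  : 'NULL') if $go;
        print {$fh} join("\t", @cols), "\n"
            or return (0, "Cannot write row: $!");
    }
    return (1, '');
}

my $rows = [ { appl => 'ApplA', match => 'M0001', ipr => 'IPR_HYPO1' } ];

# An in-memory handle is used here; \*STDOUT or a handle opened
# on a real file can be passed in exactly the same way.
my $buf = '';
open my $out, '>', \$buf or die "Cannot open in-memory handle: $!";
my ($res, $msg) = write_rows($rows, 'seq1', $out, 1, 1);
die $msg unless $res;
close $out;
print $buf;
```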