NAME

PfamClan - Indexes Pfam-C (Clan) files.


SYNOPSIS

  use Index::PfamClan;


DESCRIPTION

  This package indexes Pfam-C (Clan) files.
  It indexes each entry on the AC, ID, DE and MB keys (see below).
  You can index additional fields by editing $regexp.

  AC  Accession number of the clan
  ID  Identifier (short name)
  DE  Description (full name)
  MB  Pfam ID within the clan

  Example entry:
  #AC   CL0002.7
  #ID   Kelch
  #DE   Kelch repeat
  #AU   Finn RD
  #RN   [1]
  #RM   10603472
  #RT   The kelch repeat superfamily of proteins: propellers of cell
  #RT   function. 
  #RA   Adams J, Kelso R, Cooley L; 
  #RL   Trends Cell Biol 2000;10:17-24.
  #RN   [2]
  #RM   2002850
  #RT   Novel thioether bond revealed by a 1.7 A crystal structure of
  #RT   galactose oxidase. 
  #RA   Ito N, Phillips SE, Stevens C, Ogel ZB, McPherson MJ, Keen JN, Yadav
  #RA   KD, Knowles PF; 
  #RL   Nature 1991;350:87-90.
  #DR   CATH; 2.130.10.80;
  #DR   SCOP; 50965;
  #CC   The kelch repeat is a ubiquitous element that typically occurs in
  #CC   four to seven repeats.  The sequence identity between individual
  #CC   kelch repeats is low.  However, there are eight key conserved
  #CC   residues, which include a distinctive double glycine.  The structure
  #CC   of a fungal galactose oxidase reveals a seven blade beta-propeller,
  #CC   with each blade  comprised of seven kelch repeats.  Each kelch repeat
  #CC   or blade forms a four strand beta sheet [1,2].   Kelch repeat
  #CC   containing proteins are involved in many aspects  of cell function,
  #CC   which include; actin-associated proteins, cell morphology and
  #CC   organisation, gene expression, viral binding  partners and
  #CC   extracellular roles [1].
  #MB   PF07646;
  #MB   PF01344;
  #//
  # Usage examples
  use Index::PfamClan;
  my ($res, $msg, $index);
  ($res, $index) = Index::PfamClan->new($file); # $file: Pfam-C file to index
  die $index unless $res;
  # This input record delimiter is used when retrieving an entry from the file.
  ($res, $msg) = $index->setRecDel('dumper', '\n//');
  die $msg unless $res;
  # This input record delimiter is used while building the index file. The file
  # is read line by line, so a specific pattern is needed to record each
  # position in the file.
  ($res, $msg) = $index->setRecDel('building', '//');
  die $msg unless $res;
  # Build the index from a list of keys; see $regexp in this module for the
  # allowed rules. With no argument, the index is built with every rule
  # described in $regexp.
  my ($r_inx, $mess);
  ($res, $r_inx) = $index->buildIndex(['ac', 'name']);
  die $r_inx unless $res;
  # Or build the index from your own key-value pairs based on regular
  # expressions. Only the first match is taken into account, so to index
  # several things found on the same line you need a separate key-value pair.
  ($res, $r_inx) = $index->buildIndex({ 'id' => '>(\S+)', 'name' => '^\s+\w+\s+(\S+)' });
  die $r_inx unless $res;
  ($res, $mess) = $index->indexOut($r_inx); # needs a reference to a hash table
  die $mess unless $res;
  my $id = 'CL0001.10';
  my $pos;
  ($res, $pos) = $index->getIndex($id); # returns the position in the file for this $id
  die $pos unless $res;
  # getEntry returns a reference to an array, because several entries may be found.
  my $entry;
  ($res, $entry) = $index->getEntry($id); # returns the complete entry
  die $entry unless $res;
  $entry = $entry->[0] if $entry;
  # Either parse the whole entry once, then query its fields:
  ($res, $msg) = $index->parseFields(\$entry);
  die $msg unless $res;
  my $name;
  ($res, $name) = $index->getField('name'); # returns the name of this entry
  $name = $name->[0] if $name;
  my $fields;
  ($res, $fields) = $index->getField(['ac', 'name', 'desc']); # returns the ac, name and description of this entry
  if ($fields) {
      my $ac = $fields->[0];
      my $nm = $fields->[1];
      my $de = $fields->[2];
  }
  .... # see below for fields you can retrieve.
  ($res, $name) = $index->get_name(); # returns the name of this entry
  $name = $name->[0] if $name;
  # Or, more simply, parse the entry on the fly:
  ($res, $name) = $index->getField('name', \$entry); # returns the name of this entry
  $name = $name->[0] if $name;


VERSIONS

$Id: PfamClan.pm.html,v 1.1.1.1 2005/08/18 13:18:25 hunter Exp $

Copyright (c) European Bioinformatics Institute 2002


AUTHORS / ACKNOWLEDGEMENTS

Emmanuel Quevillon <tuco@ebi.ac.uk>

new

 Description:  Creates a new Index::PfamClan object
 Arguments:    $file the file to index
               $tool whether to use Dispatcher::Tool to read index.conf values (optional)
 Returns:      1, $self on success
               0, msg on failure
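
The two-value return convention documented above (a status flag followed by either the object or an error message) can be sketched as follows. Demo::Index is a hypothetical stand-in class written only for this illustration. It is not the Index::PfamClan constructor, and the checks it performs are assumptions.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in class: it only mimics the documented
# (1, $self) / (0, msg) return convention.
package Demo::Index;

sub new {
    my ($class, $file) = @_;
    return (0, 'No file given') unless defined $file && length $file;
    return (0, "File not found: $file") unless -e $file;
    my $self = bless { file => $file }, $class;
    return (1, $self);
}

package main;

# A temporary file guarantees that the success path has a file to open.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);

# Success: the second value is the object.
my ($ok, $obj) = Demo::Index->new($tmp_name);

# Failure: the second value is the error message.
my ($bad, $err) = Demo::Index->new('/no/such/file.pfamc');

print $ok  ? "created index for $obj->{file}\n" : "error: $obj\n";
print $bad ? "unexpected success\n"             : "error: $err\n";
```

Because the second value is either the object or the message, callers can write `die $index unless $res;` as in the usage examples.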

_init

 Description:  Initializes the record delimiter and the file, and parses the index configuration file.
 Arguments:    
 Returns:      1, '' on success
               0, msg on error