bio-mirror data archive at iubio.bio.indiana.edu
december 1998, don gilbert

  ftp iubio.bio.indiana.edu, user: iubio, password: iubio
  or ftp://iubio:iubio@iubio.bio.indiana.edu/data-archive/
  
This data archive is linked through Indiana University to the vBNS
(very high performance Backbone Network Service), as well as to APAN
and TransPAC, which will provide like-connected computer systems
with high speed access to this common biosequence data.

This is a public ftp location, but not standard anonymous ftp, so that
this large data set is not picked up by hosts that mirror
the current anon-ftp section.
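
As a sketch of the login described above, the following Python uses the
standard ftplib module with the public iubio/iubio account. The function
names archive_url and list_archive are illustrative only, and the host may
no longer answer at this address.

```python
from ftplib import FTP

# Login details as given at the top of this document.
HOST = "iubio.bio.indiana.edu"
USER = "iubio"          # public account, but not "anonymous"
PASSWORD = "iubio"
ARCHIVE_DIR = "/data-archive"


def archive_url(host=HOST, user=USER, password=PASSWORD, directory=ARCHIVE_DIR):
    """Build the ftp:// URL form of the same login."""
    return f"ftp://{user}:{password}@{host}{directory}/"


def list_archive(host=HOST, user=USER, password=PASSWORD, directory=ARCHIVE_DIR):
    """Log in and return the names found in the archive directory.

    Needs network access to the FTP server.
    """
    with FTP(host, timeout=30) as ftp:
        ftp.login(user=user, passwd=password)
        ftp.cwd(directory)
        return ftp.nlst()
```

Calling list_archive() should return the section names (docs, embl,
genbank, ...) if the server is reachable; an anonymous login is expected
to be refused or to land in the separate anon-ftp area.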
 
size(Kb)   current data sections
------   -----------------------
1       ./docs      -- documents and mirror.pl mirroring software
1994170 ./embl      -- embl release and daily cumulative update from ebi
2252563 ./genbank   -- genbank release and daily cumulative update from ncbi
81889   ./pir       -- pir release in codata (ascii) format from ncbi
1       ./pdb       -- nothing yet (for brookhaven pdb data set)
54298   ./swissprot -- swissprot release from expasy
93161   ./trembl    -- trembl non-redundant release from expasy
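
For anyone planning disk space for a copy, the sizes above can be totalled
with a few lines of Python. This is only a sketch: it assumes "Kb" means
units of 1024 bytes, and the names used are illustrative.

```python
# Section sizes in Kb, copied from the table above.
SECTION_SIZES_KB = {
    "docs": 1,
    "embl": 1994170,
    "genbank": 2252563,
    "pir": 81889,
    "pdb": 1,
    "swissprot": 54298,
    "trembl": 93161,
}


def total_size_kb(sizes):
    """Sum the per-section sizes (in Kb)."""
    return sum(sizes.values())


def kb_to_gb(kb):
    """Convert Kb to Gb, assuming 1 Kb = 1024 bytes and 1 Gb = 1024**3 bytes."""
    return kb / (1024 * 1024)


total = total_size_kb(SECTION_SIZES_KB)
print(f"{total} Kb, about {kb_to_gb(total):.2f} Gb")
```

Under those assumptions the compressed archive is a little over 4 Gb.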

Data files are unix compressed (.Z).  Currently only the primary
sequence data is mirrored; accessory indices and other parts are not.
These are now on a daily mirror schedule, run at 05:00 GMT.

One can learn more about vBNS, including which systems are connected,
at http://www.vbns.net/.
For networking in the Asia-Pacific region, one can learn more about TransPAC
at http://www.transpac.org/


===================================
bio data FTP mirror settings
===================================
Packages used at IUBio for standard mirror.pl FTP mirroring.
These reflect my own selections of data, which may not be
what others want.  Let me know of suggested changes or additions
to these file selections or to the organization of the data folders.
I currently have some limits on disk space, which accounts for
some of the file selections, but have more disk on order. These are
now on a daily mirror schedule, run at 05:00 GMT.

==> ncbi <==

package=genbank
        comment=Genbank from NCBI
        site=ncbi.nlm.nih.gov
        remote_dir=ncbi-genbank
        local_dir=/c3/pub/data-archive/genbank
        recursive=false
        get_patt=(\.seq\.Z$|^gbrel\.txt|README|release\.notes)
        exclude_patt=(^genomes|^vms|^daily)
        local_ignore=(^daily|^genomes|local$)

package=gbdaily
        comment=Daily updates, Genbank from NCBI
        site=ncbi.nlm.nih.gov
        remote_dir=ncbi-genbank/daily
        local_dir=/c3/pub/data-archive/genbank/daily
        recursive=false
        get_patt=(^gbcu\.flat\.Z$|README)
        local_ignore=(local$)

# gbgenome not yet mirrored
package=gbgenome
        comment=Genome section, Genbank from NCBI
        site=ncbi.nlm.nih.gov
        remote_dir=ncbi-genbank/genomes
        local_dir=/c3/pub/data-archive/genbank/genomes
        get_patt=(\.tar\.Z$|README)
        local_ignore=(local$)

package=pir
        comment=PIR from NCBI
        site=ncbi.nlm.nih.gov
        remote_dir=repository/PIR/ascii
        local_dir=/c3/pub/data-archive/pir
        recursive=false
        get_patt=(\.dat\.Z$|README)
        local_ignore=(local$)

==> ebi <==

package=embl
        comment=EMBL from EBI
        site=ftp.ebi.ac.uk
        remote_dir=pub/databases/embl/release
        local_dir=/c3/pub/data-archive/embl/release
        get_patt=(\.dat\.Z$|^Release|README|\.doc$)
        recursive=false
        exclude_patt=(^gcg|^emvec)
        local_ignore=(local$)

package=emdaily
        comment=EMBL new from EBI
        site=ftp.ebi.ac.uk
        remote_dir=pub/databases/embl/new
        local_dir=/c3/pub/data-archive/embl/new
        recursive=false
        get_patt=(^cumulative\.dat\.Z$|README)
        exclude_patt=(^gcg|^list)
        local_ignore=(local$)

==> expasy <==

package=swissprot
        comment=SwissProt main data from Expasy
        site=expasy.hcuge.ch
        remote_dir=databases/swiss-prot/compressed
        local_dir=/c3/pub/data-archive/swissprot
        recursive=false
        get_patt=(README|sprot.*\.dat\.Z$|userman\.txt\.Z|relnotes\.txt\.Z)
        local_ignore=(updates|local$)

package=swissnew
        comment=SwissProt new data from Expasy
        site=expasy.hcuge.ch
        remote_dir=databases/swiss-prot/updates
        local_dir=/c3/pub/data-archive/swissprot/updates
        recursive=false
        get_patt=(new_seq\.dat$)
        local_ignore=(local$)

package=trembl
        comment=TrEMBL non-redundant data from Expasy
        site=expasy.hcuge.ch
        remote_dir=databases/sp_tr_nrdb
        local_dir=/c3/pub/data-archive/trembl
        recursive=false
        get_patt=(README|trembl.*\.dat\.Z$|userman\.txt\.Z|relnotes\.txt\.Z)
        local_ignore=(local$)
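
To check what a pair of patterns would pick up before running a mirror,
the selection can be approximated in Python. This is a rough sketch, not
mirror.pl itself: it assumes a file is fetched when its name matches
get_patt and does not match exclude_patt, and the sample file names are
made up for illustration. mirror.pl is written in Perl and may apply its
patterns to pathnames differently in detail.

```python
import re

# Patterns copied from the genbank package above.
GET_PATT = r"(\.seq\.Z$|^gbrel\.txt|README|release\.notes)"
EXCLUDE_PATT = r"(^genomes|^vms|^daily)"


def select_files(names, get_patt, exclude_patt=None):
    """Return the names that match get_patt and do not match exclude_patt."""
    get_re = re.compile(get_patt)
    exclude_re = re.compile(exclude_patt) if exclude_patt else None
    selected = []
    for name in names:
        if exclude_re and exclude_re.search(name):
            continue  # excluded names are dropped before get_patt is tried
        if get_re.search(name):
            selected.append(name)
    return selected


# Hypothetical remote listing, for illustration only.
SAMPLE_LISTING = [
    "gbpri1.seq.Z",
    "gbpri1.idx.Z",
    "gbrel.txt",
    "README.genbank",
    "daily",
    "genomes",
]

for name in select_files(SAMPLE_LISTING, GET_PATT, EXCLUDE_PATT):
    print(name)
```

With the sample listing, only the .seq.Z file, gbrel.txt and the README
pass; the index file and the daily and genomes entries are skipped.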


===================================
Some network connection information
===================================

iubio has a 100 Mb/s connection to the IU campus network.

traceroute to disc.dna.affrc.go.jp (150.26.230.101), 30 hops max, 40 byte packets
 1  hper1-gw224.ucs.indiana.edu (129.79.224.254)  2 ms  1 ms  1 ms
 2  iub-gw.ucs.indiana.edu (129.79.5.10)  1 ms  1 ms  1 ms
 3  hper3-gw.ucs.indiana.edu (129.79.5.77)  1 ms  1 ms  1 ms
 4  10.88.194.2 (10.88.194.2)  3 ms  3 ms  3 ms
 5  192.12.206.1 (192.12.206.1)  16 ms  12 ms  12 ms
 6  tpr-startap.jp.apan.net (203.181.248.237)  14 ms  13 ms  14 ms
    ..... good speed to here .....
 7  tpr-atm3-0-7.jp.apan.net (203.181.248.242)  205 ms  204 ms  204 ms
    .....it gets slower here....
 8  tppr-atm2-0-3.jp.apan.net (203.181.248.233)  226 ms  227 ms  226 ms
 9  im-tyx-01-fddi1-0.inoc.imnet.ad.jp (202.241.2.39)  228 ms  227 ms  227 ms
10  im-tyx-52-fddi1-0.inoc.imnet.ad.jp (202.241.2.84)  233 ms  231 ms  228 ms
11  im-tyc-01-ATM4-0.cnoc.imnet.ad.jp (202.241.0.9)  230 ms  228 ms  229 ms
12  im-tyc-05-fddi0-0.cnoc.imnet.ad.jp (202.241.1.194)  231 ms  231 ms  230 ms
13  im-tbc-02-ATM2-0.enoc.imnet.ad.jp (202.241.0.26)  238 ms  235 ms  237 ms
14  im-tbc-01-fddi4-0.enoc.imnet.ad.jp (202.241.1.161)  240 ms  237 ms  237 ms
15  202.241.0.130 (202.241.0.130)  246 ms  242 ms *
16  affgw2.Tsukuba-Noc.maffin.ad.jp (150.26.252.1)  218 ms  222 ms  222 ms
17  NIS600.cc.affrc.go.jp (150.26.1.64)  224 ms  230 ms  230 ms
18  disc.dna.affrc.go.jp (150.26.230.101)  227 ms *  225 ms
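
The point where a route slows down, as marked by hand in the trace above,
can also be located with a short script. The sketch below is one possible
approach and not part of the mirroring setup: it takes the lowest of the
reported round-trip times at each hop (a choice made here to damp
one-off spikes) and reports the largest rise between consecutive hops.

```python
import re

HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")        # hop number, then the rest
RTT_RE = re.compile(r"(\d+(?:\.\d+)?) ms")       # each "<number> ms" reading

# Hops 5-8 copied from the trace to disc.dna.affrc.go.jp above.
SAMPLE = """\
 5  192.12.206.1 (192.12.206.1)  16 ms  12 ms  12 ms
 6  tpr-startap.jp.apan.net (203.181.248.237)  14 ms  13 ms  14 ms
 7  tpr-atm3-0-7.jp.apan.net (203.181.248.242)  205 ms  204 ms  204 ms
 8  tppr-atm2-0-3.jp.apan.net (203.181.248.233)  226 ms  227 ms  226 ms
"""


def parse_hops(text):
    """Return (hop_number, lowest_rtt_ms) for every hop with at least one reply."""
    hops = []
    for line in text.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue  # header line or hand-written note
        rtts = [float(value) for value in RTT_RE.findall(match.group(2))]
        if rtts:
            hops.append((int(match.group(1)), min(rtts)))
    return hops


def largest_jump(hops):
    """Return (from_hop, to_hop, increase_ms) for the biggest rise, or None."""
    best = None
    for (hop_a, rtt_a), (hop_b, rtt_b) in zip(hops, hops[1:]):
        delta = rtt_b - rtt_a
        if best is None or delta > best[2]:
            best = (hop_a, hop_b, delta)
    return best


print(largest_jump(parse_hops(SAMPLE)))
```

On these four hops the largest rise falls between hop 6 and hop 7, which
agrees with the hand-written notes in the trace.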


traceroute to ncbi.nlm.nih.gov (130.14.25.1), 30 hops max, 40 byte packets
 1  hper1-gw224.ucs.indiana.edu (129.79.224.254)  2 ms  1 ms  1 ms
 2  iub-gw.ucs.indiana.edu (129.79.5.10)  1 ms  1 ms  1 ms
 3  PVC2.INT.GW1.IND1.ALTER.NET (157.130.101.193)  4 ms  4 ms  6 ms
 4  * 121.ATM2-0.XR2.CHI6.ALTER.NET (146.188.208.174)  9 ms  8 ms
 5  290.ATM2-0.TR2.CHI4.ALTER.NET (146.188.209.10)  8 ms  8 ms  8 ms
 6  106.ATM7-0.TR2.DCA1.ALTER.NET (146.188.136.110)  37 ms  36 ms  39 ms
 7  198.ATM9-0-0.XR2.TCO1.ALTER.NET (146.188.161.189)  38 ms  38 ms  36 ms
 8  192.ATM4-0-0.BR1.TCO1.ALTER.NET (146.188.160.69)  37 ms  37 ms  37 ms
 9  137.39.23.14 (137.39.23.14)  37 ms  37 ms  37 ms
10  p1-0.vienna1-nbr3.bbnplanet.net (4.0.5.46)  41 ms  37 ms  36 ms
11  p8-0-0.washdc1-br2.bbnplanet.net (4.0.1.89)  38 ms  38 ms  38 ms
12  f1-0.washdc1-cr3.bbnplanet.net (4.0.36.23)  38 ms  42 ms  38 ms
13  h4-0.nlm2.bbnplanet.net (4.0.146.162)  40 ms  40 ms  42 ms
14  130.14.15.4 (130.14.15.4)  40 ms  39 ms  41 ms
15  130.14.90.186 (130.14.90.186)  39 ms  39 ms  40 ms
16  ncbi.nlm.nih.gov (130.14.25.1)  41 ms *  42 ms


traceroute to gin.ebi.ac.uk (193.62.196.129), 30 hops max, 40 byte packets
 1  hper1-gw224.ucs.indiana.edu (129.79.224.254)  2 ms  1 ms  1 ms
 2  iub-gw.ucs.indiana.edu (129.79.5.10)  1 ms  1 ms  1 ms
 3  PVC2.INT.GW1.IND1.ALTER.NET (157.130.101.193)  5 ms  10 ms  6 ms
 4  121.ATM2-0.XR2.CHI6.ALTER.NET (146.188.208.174)  9 ms  9 ms  7 ms
 5  190.ATM8-0-0.GW1.CHI6.ALTER.NET (146.188.208.69)  11 ms  9 ms  10 ms
 6  teleglobe-chi1.customer.alter.net (157.130.99.6)  9 ms  10 ms  8 ms
 7  207.45.223.30 (207.45.223.30)  31 ms  29 ms  28 ms
 8  207.45.223.158 (207.45.223.158)  27 ms  25 ms  26 ms
 9  207.45.199.234 (207.45.199.234)  50 ms  44 ms  44 ms
10  207.45.215.1 (207.45.215.1)  45 ms  48 ms  44 ms
11  cust-gw.Teleglobe.net (207.45.215.166)  44 ms  47 ms  46 ms
12  193.62.157.9 (193.62.157.9)  121 ms  125 ms  123 ms
13  external-gw.ja.net (193.63.94.40)  125 ms  126 ms  126 ms
14  london-core.ja.net (146.97.251.58)  124 ms  123 ms  132 ms
15  cam-pop.ja.net (146.97.251.54)  129 ms  128 ms  127 ms
16  gw.hinx.ja.net (193.60.0.6)  134 ms  144 ms  135 ms
17  194.66.91.18 (194.66.91.18)  129 ms  131 ms  204 ms
18  gin.ebi.ac.uk (193.62.196.129)  134 ms  135 ms  128 ms


traceroute to expasy.hcuge.ch (129.195.254.61), 30 hops max, 40 byte packets
 1  hper1-gw224.ucs.indiana.edu (129.79.224.254)  2 ms  1 ms  1 ms
 2  iub-gw.ucs.indiana.edu (129.79.5.10)  1 ms  1 ms  1 ms
 3  PVC2.INT.GW1.IND1.ALTER.NET (157.130.101.193)  15 ms  15 ms  19 ms
 4  121.ATM2-0.XR2.CHI4.ALTER.NET (146.188.208.166)  23 ms  15 ms  22 ms
 5  194.ATM0-0-0.BR1.CHI1.ALTER.NET (146.188.208.5)  15 ms  29 ms  23 ms
 6  137.39.23.34 (137.39.23.34)  18 ms *  11 ms
 7  f3-1.t24-0.Chicago.t3.ans.net (140.222.27.122)  11 ms  13 ms  11 ms
 8  h11-1.t40-0.Cleveland.t3.ans.net (140.223.25.22)  25 ms  35 ms  41 ms
 9  h12-1.t36-0.New-York2.t3.ans.net (140.223.37.9)  53 ms  37 ms  49 ms
10  f0-0.c36-11.New-York2.t3.ans.net (140.223.36.222)  53 ms  55 ms *
11  h0-0.enss3235.t3.ans.net (204.151.184.214)  164 ms  159 ms  153 ms
12  swiEZ1-F1-0-0.switch.ch (130.59.20.206)  152 ms *  167 ms
13  swiCE1-A0-0-2.switch.ch (130.59.230.18)  153 ms  152 ms  165 ms
14  swiGE1-A6-0-1.switch.ch (130.59.230.2)  142 ms  156 ms  175 ms
15  dg-hcuge.unige.ch (192.33.214.8)  143 ms  153 ms  157 ms
16  expasy.expasy.ch (129.195.254.61)  147 ms *  155 ms