Bio-Mirror Search Data | Documents | FTP Databanks

2010.November ... LFTP client problem advisory

2010.August ... provides mirroring of biology data sets. "GridFTP" is a new, faster data transfer method that is now in place and in testing. Please join this trial. The plan is to replace the current service with this newer one (retaining standard ftp access, but dropping rsync). Your comments are welcome. -- Don Gilbert


biosequence & bioinformatics data
This node:

This is a worldwide bioinformatics public service for high-speed access to up-to-date DNA & protein biological sequence databanks.

Bio-mirror Nodes
New Zealand:

Bio-mirror Archive
Historical release archive, with copies of GenBank from the 1990s as distributed on compact disc.
Software for biosequence data management and analysis.
Web services
Services at this site, including search and fetch DNA & protein sequences.

These databanks have been growing so rapidly that distribution is hampered by existing Internet speeds.

The Bio-Mirror project provides high-speed access, with nightly updates, to these multi-gigabyte data sets.

Project sites are connected with Internet2 infrastructure of vBNS, Abilene, TransPAC, and the Asia-Pacific Advanced Network (APAN). To learn more, see these bio-mirror documents.

Please e-mail comments or questions to

Databanks   (Current Status)
Local mirror | Description | Home site
BlastDB | Biosequence databases for BLAST searches | NCBI
Blocks | Highly conserved regions of proteins | NCBI
DDBJ | DNA Data Bank of Japan | NIG
EMBL | The EMBL Nucleotide Sequence Database | EBI
Ensembl Human | Automatic annotation on eukaryotic genomes | EnsEMBL
Ensembl | Automatic annotation on eukaryotic genomes | EnsEMBL
Enzyme | Enzyme nomenclature database | ExPASy
GenBank Genomes | Whole genome sequence section of GenBank | NCBI
GenBank | GenBank Sequence Database | NCBI
GeneOntology | Vocabularies of gene functions and roles | GeneOntology
InterPro | InterPro Protein databank | EBI
PDB | Protein Data Bank of 3-D macromolecular structures | RCSB
PIR | Protein Information Resource | NBRF
PIRNEW | PIR updates from NBRF, Georgetown | NBRF
Pfam | The Pfam database of protein domains and HMMs | WUSTL
Prosite | Database of protein families and domains | ExPASy
Rebase | The Restriction Enzyme Database | NEB
RefSeq | NCBI Reference Sequences | NCBI
SRS Databanks | List of active SRS databases around world | EBI
SWISS-PROT | Annotated protein sequence database | ExPASy
Taxonomy | Species names | NCBI, EBI
TrEMBL | A supplement to SWISS-PROT | EBI
UniGene | Unique Gene Sequence Collection for Human, Mouse, Rat, and Zebrafish | NCBI
euGenes | Eukaryote Genes Summary Databank | IUBio

* Commercial use restrictions on SWISS-PROT and PROSITE: Publishers of the SWISS-PROT and PROSITE data sets ask that all commercial users participate in the funding of these important data by paying a license fee. No fee is charged to academic users.

LFTP client advisory:

The client "lftp" is causing serious problems at this site. It is misbehaving and overloading the bio-mirror server. This may be a matter of bad configuration, or something in the client code. It often runs many more simultaneous processes than the server can support for a shared resource, and each lftp process often uses much more CPU than any other ftp client. Since early 2010 several customers have started using this client, and it has caused problems in all cases. I recommend that you instead use the GridFTP client, which is well designed for rapid long-distance ftp. Failing that, other well-designed ftp clients are widely available and can be used, with up to 10 simultaneous connections to this server. However, to preserve this shared resource we will block or limit access by misbehaving clients such as lftp.
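
For users who script their own downloads, the 10-connection limit in the advisory above can be enforced on the client side. The following is only a minimal sketch using Python's standard ftplib and a fixed-size thread pool; it is not a tool provided or endorsed by the Bio-Mirror project. The function names fetch_one and fetch_all, the host ftp.example.org, and the file paths are all hypothetical placeholders, and anonymous login is assumed.

```python
from concurrent.futures import ThreadPoolExecutor
from ftplib import FTP

MAX_CONNECTIONS = 10  # upper limit stated in the advisory


def fetch_one(host, remote_path, local_path):
    """Download one file over anonymous FTP (host and paths are placeholders)."""
    with FTP(host) as ftp:
        ftp.login()  # no arguments: anonymous login (an assumption)
        with open(local_path, "wb") as out:
            ftp.retrbinary("RETR " + remote_path, out.write)
    return local_path


def fetch_all(jobs, fetch=fetch_one, max_connections=MAX_CONNECTIONS):
    """Run fetch(*job) for every job, never more than max_connections at once."""
    if not 1 <= max_connections <= MAX_CONNECTIONS:
        raise ValueError("max_connections must be between 1 and %d" % MAX_CONNECTIONS)
    # The pool size caps the number of simultaneous FTP sessions.
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        futures = [pool.submit(fetch, *job) for job in jobs]
        return [f.result() for f in futures]


# Hypothetical usage (placeholder host and paths, not real Bio-Mirror locations):
# jobs = [("ftp.example.org", "/pub/data/file1.gz", "file1.gz"),
#         ("ftp.example.org", "/pub/data/file2.gz", "file2.gz")]
# fetch_all(jobs, max_connections=4)
```

Each worker opens one FTP session, so the pool size is the number of simultaneous connections; choosing a value well below 10 leaves more of the shared server for other users.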

Developed at the Genome Informatics Lab
of Indiana University Biology Department

The Bio-Mirror project receives support from organizations including
APBioNet, APAN | AFFRC, Japan | AARNet, Australia | NUS, Singapore | IMCAS, China | KAIST, Korea | Kasetsart University, Thailand | NSF & Indiana University, USA