Report of the Workshop on Integration of Microbial Databases




CME is interested in any comments you may have about the contents of this report.
Please contact niels@vitro.cme.msu.edu with any questions or comments you may have.

Table of Contents

1.0 Introduction

2.0 Goals

3.0 IMD Prototypes Demonstration

4.0 Recommended Activities

4.1 Organization and Administration

4.2 System Design and Implementation

The workshop did not settle on detailed recommendations of which database technology to choose. This will be decided based on further development of prototype(s). The workshop recognized that for an integration to be successful, contributing members should not be restricted in how they curate their own data, bound to particular release dates, or forced to conform to a new data organization chosen by the coordinating center. Such requirements would only discourage many databases from joining.

It was recommended, however, that IMD members adhere to the following minimal requirements when submitting data: 1) be able to upload 100% consistently formatted ASCII versions of their data (except, of course, for graphics), and 2) provide a clear description of what the data are and how they relate, either in concise English or in a formal notation. It was recognized that an effective metadata system is an important component of an integrated database. The suitability of current metadata curation systems needs to be evaluated.
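As an illustration of the first requirement, the following is a minimal sketch of how a coordinating center might check that a submitted file is pure ASCII and consistently formatted. It assumes a delimited, one-record-per-line layout; the function name `check_submission` and the tab delimiter are hypothetical choices and not something the workshop specified.

```python
def check_submission(text, delimiter="\t"):
    """Return a list of problems found; an empty list means the file passed.

    Assumption: one record per line, fields separated by `delimiter`.
    """
    problems = []

    # Requirement 1a: the upload must be plain ASCII.
    if not text.isascii():
        problems.append("file contains non-ASCII characters")

    # Ignore blank lines; records are numbered among the non-blank lines.
    rows = [line for line in text.splitlines() if line.strip()]
    if not rows:
        problems.append("file contains no records")
        return problems

    # Requirement 1b: every record must have the same number of fields
    # as the first record.
    expected = len(rows[0].split(delimiter))
    for number, row in enumerate(rows[1:], start=2):
        found = len(row.split(delimiter))
        if found != expected:
            problems.append(
                f"record {number}: {found} fields, expected {expected}"
            )
    return problems
```

A check of this kind says nothing about what the fields mean; that is the role of the data description called for in the second requirement.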

Most data relate to organisms; therefore, a consistent organism description and nomenclature is required to connect the data. It was decided to initiate an international collaborative effort to curate, via the internet, a single, comprehensive prokaryotic nomenclatural database, with synonyms and name histories, to be used to interconnect data from participating databases (further discussion below).
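The role of such a nomenclatural database as a connecting key can be sketched as follows. This is only an illustration under stated assumptions: the dictionary layout, the function names `build_index` and `join_by_organism`, and the database names in the usage note are hypothetical. The single name history shown (Stenotrophomonas maltophilia, formerly placed in Pseudomonas and Xanthomonas) is used purely as an example.

```python
# Hypothetical nomenclatural records: accepted name -> earlier synonyms.
NOMENCLATURE = {
    "Stenotrophomonas maltophilia": [
        "Pseudomonas maltophilia",
        "Xanthomonas maltophilia",
    ],
}


def build_index(nomenclature):
    """Map every known name (accepted or synonym) to its accepted name."""
    index = {}
    for accepted, synonyms in nomenclature.items():
        for name in [accepted, *synonyms]:
            index[name.lower()] = accepted
    return index


def join_by_organism(index, *databases):
    """Merge records from several databases under the accepted name.

    Each database is a (label, {organism name: record}) pair. Records
    whose names are not in the index are skipped in this sketch.
    """
    merged = {}
    for label, records in databases:
        for name, record in records.items():
            accepted = index.get(name.strip().lower())
            if accepted is None:
                continue
            merged.setdefault(accepted, {})[label] = record
    return merged
```

For example, a culture collection that still lists a strain under "Xanthomonas maltophilia" and a second database that uses the current name would both be merged under "Stenotrophomonas maltophilia" by `join_by_organism(build_index(NOMENCLATURE), ("culture_db", {...}), ("fame_db", {...}))`.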

An integrated microbial database should be structured around an up-to-date phylogeny and/or taxonomy. This will provide a natural framework for selecting input and for viewing results. Examples of queries brought up during the workshop include "What are the evolutionary relationships among organisms that are capable of nitrogen fixation, by what pathways is nitrogen fixed, and are there any FAME signatures common to these groups of organisms?". Retrieving the answers would require information from several contributing databases, and the results would be displayed in a phylogenetic format. Results would also contain links to further information. For example, linking to the culture collection databases would permit a user to inquire further about the availability of a particular culture of nitrogen-fixing bacteria and would provide additional information about nitrogen-fixing bacteria that are held in culture collections.
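A query of this kind can be sketched as pruning a phylogenetic tree to the organisms that satisfy a condition and then attaching links from other databases to the surviving leaves. Everything in the sketch below is hypothetical: the toy tree, the organism and clade names, the two in-memory "databases", and the function names are placeholders chosen for illustration, not data or designs from the workshop.

```python
# Hypothetical toy phylogeny: (name, [children]); leaves have no children.
TREE = ("root", [
    ("clade_1", [("organism_A", []), ("organism_B", [])]),
    ("clade_2", [("organism_C", [])]),
])

# Hypothetical contributing databases, keyed by organism name.
PHENOTYPES = {
    "organism_A": {"fixes_nitrogen": True},
    "organism_B": {"fixes_nitrogen": False},
    "organism_C": {"fixes_nitrogen": True},
}
CULTURES = {"organism_A": ["collection X, strain 1"]}


def prune(node, keep):
    """Return the subtree holding only leaves for which keep(name) is true."""
    name, children = node
    if not children:
        return node if keep(name) else None
    kept = [sub for sub in (prune(child, keep) for child in children) if sub]
    return (name, kept) if kept else None


def render(node, depth=0):
    """Render a tree as indented lines, adding culture links to leaves."""
    name, children = node
    line = "  " * depth + name
    if not children and name in CULTURES:
        line += " [cultures: " + "; ".join(CULTURES[name]) + "]"
    lines = [line]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines


def nitrogen_fixers():
    """Answer the example query against the toy data above."""
    def keep(name):
        return PHENOTYPES.get(name, {}).get("fixes_nitrogen", False)

    result = prune(TREE, keep)
    return render(result) if result else []
```

Calling `print("\n".join(nitrogen_fixers()))` shows organism_A and organism_C in their clades, with the culture collection entry attached to organism_A. In a real federation the two dictionaries would be replaced by lookups against the contributing databases.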

For data that do not lend themselves easily to phylogenetic presentation (e.g. gene locations, metabolic pathways), it must be easy to connect existing methods for viewing the data. Generally, an open model should be sought in which any type of microbial data or software could be included.

Finally, the participants agreed that, ideally, anyone with World Wide Web access should be able to navigate through and query all data easily and effectively.

4.3 Data to be Integrated

4.3.1 First Priority Data

a. Nomenclatural Database
b. Phylogenetic Trees
c. Phenotypic Data

4.3.2 Existing Databases

a. Publicly Accessible On-Line Databases
b. Independently Curated, Specific Databases

4.3.3 Databases Needing Development

a. ARDRA
b. Habitat
c. Databases Obtained with Commercial Test Kits or Systems
d. Images

4.3.4 Other Groups Which Have Microbial Strain Data

5.0 Federation Membership and Responsibilities

6.0 Workshop Participants

7.0 Summary


More information about the integrated database project is available in insights, the CME Newsletter.

