About Viroverse

Interested in learning more about Viroverse? Read below, subscribe to the announcements mailing list, or send inquiries to the Viroverse team.

History

Dr. Mullins (project PI) has long been interested in a structured system to store the data generated and used in his lab. After evaluating commercial systems, he found that none met the needs of investigators looking to include experimental data for molecular biology together with the results and analysis. Thus began Viroverse.

The first step in building Viroverse was to develop a comprehensive database schema, storing viral sequences, subject data and additional information such as sequence annotations and alignments. Using the SeaPIP and MACS cohorts as prototypes of robust data, the Viroverse design team chose a level of abstraction in database representation that allows future values to be added easily and accommodates quality controls by flagging unexpected values. Experience adapting existing data from these two large pilot cohorts revealed the complexity of integrating even basic data from multiple sources. All values are stored in the most exact format possible so that precision is not sacrificed to the lowest common format.

We chose a highly normalized relational database structure specific to the molecular biology and attendant data of viral pathogens. This structure lets us enforce a degree of data standardization at the database level through lookup tables of controlled vocabulary for reusable values and foreign key referential integrity. By contrast, the dimensional approach typically favored in data warehousing architectures uses a relatively small number of fact tables to accommodate a wide variety of data. Using individual tables to represent the product of each process also lets us associate unique pieces of data with each object, in addition to generalized annotations that may apply across object classes.
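As a rough illustration of how a lookup table of controlled vocabulary and a foreign key can enforce standardization at the database level, here is a minimal sketch. It is not the Viroverse schema: the table names (`tissue_type`, `sample`), the column names, and the helper functions are hypothetical, and it uses Python with an in-memory SQLite database rather than PostgreSQL so that it runs on its own.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with one lookup table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this is enabled on the connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        -- Lookup table: the controlled vocabulary of allowed values.
        CREATE TABLE tissue_type (
            tissue_type_id INTEGER PRIMARY KEY,
            name           TEXT NOT NULL UNIQUE
        );
        -- Each sample must reference an existing vocabulary entry.
        CREATE TABLE sample (
            sample_id      INTEGER PRIMARY KEY,
            subject_label  TEXT NOT NULL,
            tissue_type_id INTEGER NOT NULL
                REFERENCES tissue_type (tissue_type_id)
        );
        INSERT INTO tissue_type (tissue_type_id, name)
        VALUES (1, 'plasma'), (2, 'PBMC');
        """
    )
    return conn


def try_insert_sample(conn, subject_label, tissue_type_id):
    """Insert a sample; return False if the database rejects the value."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO sample (subject_label, tissue_type_id) VALUES (?, ?)",
                (subject_label, tissue_type_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The foreign key refused a value outside the controlled vocabulary.
        return False


if __name__ == "__main__":
    db = build_demo_db()
    print(try_insert_sample(db, "subject-001", 1))   # known vocabulary entry
    print(try_insert_sample(db, "subject-002", 99))  # unknown vocabulary entry
```

In this sketch the database itself, not the application code, refuses a row whose `tissue_type_id` has no entry in the lookup table, which is the kind of standardization the paragraph above describes.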

Publications

The following publications describe Viroverse components or functionality:

  1. Deng, W., D. C. Nickle, G. H. Learn, B. Maust, and J. I. Mullins. 2007. ViroBLAST: a stand-alone BLAST web server for flexible queries of multiple databases and user's datasets. Bioinformatics 23:2334-2336.

Software Development

The web environment of Viroverse uses the Catalyst Model/View/Controller framework, served by an Apache mod_perl server and connected to a PostgreSQL database. A set of abstract classes models the structures of the database, views transform those classes into HTML or other structured representations, and controllers collect the data from various models to supply the view according to user requests. Changes made by developers are managed via a CVS repository.
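The division of labor among models, views, and controllers described above can be sketched in a few lines. This is an illustrative Python example under stated assumptions, not Catalyst or Viroverse code: the `Sequence` record, `SequenceModel`, `SequenceController`, and the two view functions are all hypothetical names, and the model reads from an in-memory list instead of a PostgreSQL database.

```python
import json
from dataclasses import asdict, dataclass
from html import escape


@dataclass
class Sequence:
    """Hypothetical record standing in for one row of a sequence table."""
    name: str
    length: int


class SequenceModel:
    """Model: wraps data access (here, an in-memory list instead of a database)."""

    def __init__(self, rows):
        self._rows = list(rows)

    def find_by_name(self, name):
        return [row for row in self._rows if row.name == name]


def html_view(sequences):
    """View: render model objects as an HTML list."""
    items = "".join(
        f"<li>{escape(s.name)} ({s.length} nt)</li>" for s in sequences
    )
    return f"<ul>{items}</ul>"


def json_view(sequences):
    """View: render the same objects as another structured representation."""
    return json.dumps([asdict(s) for s in sequences])


class SequenceController:
    """Controller: collects data from the model and hands it to the chosen view."""

    def __init__(self, model, views):
        self._model = model
        self._views = views

    def handle(self, name, fmt="html"):
        view = self._views[fmt]
        return view(self._model.find_by_name(name))


if __name__ == "__main__":
    model = SequenceModel([Sequence("env-01", 2550), Sequence("gag-07", 1500)])
    controller = SequenceController(model, {"html": html_view, "json": json_view})
    print(controller.handle("env-01"))
    print(controller.handle("env-01", fmt="json"))
```

Because the controller only selects a view and passes it model objects, the same request can be answered as HTML or as JSON without changing the model.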

People

The following groups form the primary development team, bringing expertise in molecular virology, computational biology, microarray analysis, and software design: