
Some examples

by Guillaume Bourque

Herpesvirus data

Hannenhalli et al., 1995 used herpesvirus gene orders as a test case for one of the first studies on the Multiple Genome Rearrangement Problem. We used the gene orders of Herpes simplex virus (HSV), Epstein-Barr virus (EBV), and Cytomegalovirus (CMV). This dataset consists of 3 genomes with 25 genes. Run demo.

Human, fruit fly, and sea urchin mtDNA data

Sankoff et al., 1996 analyzed human, sea urchin, and fruit fly mtDNA to derive the ancestral gene order. This dataset consists of 3 genomes with 33 markers. Run demo.

Metazoan mtDNA data

Blanchette et al., 1999 used BPAnalysis in a rearrangement study of 11 metazoan mtDNAs. The genomes come from 6 major metazoan groupings: nematodes (NEM), annelids (ANN), mollusks (MOL), arthropods (ART), echinoderms (ECH), and chordates (CHO). The data were extracted from Jeffrey Boore's MGA Source Guide (the page that originally hosted it is gone, so this link points to a copy hosted elsewhere). You can view the gene key here. This dataset consists of 11 genomes with 36 genes. Run demo.

Campanulaceae cpDNA data

This is one of the most challenging genome rearrangement datasets studied to date. It was analyzed by Cosner et al., 2000 using GRAPPA. It consists of 13 cpDNAs with 105 markers. Run demo.

Human-Cat-Mouse data

The raw data for the human, cat, and mouse comparisons are in the Excel spreadsheet LocusList101399.xls, provided by Bill Murphy, who was then at the National Cancer Institute (now at Texas A&M).

The data for this example differ from the previous ones in two ways. First, this is the first example dealing with multichromosomal genomes. Second, in all the other datasets it was implicit that the orientation of the genes was known; unfortunately, no orientation information was available for this dataset. Click here for the description of the genomes in the common alphabet. Because current algorithms for unsigned permutations become too time-consuming, we had to infer an orientation for the genes using preserved strips (see Hannenhalli et al., 1996). Ultimately, this should not be a problem as more data become available.

Using the human genome as a reference, we first identified all the strips in both the cat and mouse genomes. We then assigned an orientation to the markers based on these strips. Any marker for which we could not assign an orientation by this method in either the cat or the mouse genome was removed, leaving a common set of 114 markers. Click here for the description of the genomes in this alphabet. This dataset consists of 3 genomes with 114 markers spread over multiple chromosomes. Run demo.
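The strip-based orientation step described above can be illustrated with a small Python sketch. This is not the code used to prepare the dataset; it is a minimal illustration under several assumptions: markers are numbered 1, 2, 3, ... in human order, a strip is taken to be a maximal run of at least two markers whose numbers change by a constant +1 or -1 within one chromosome, and human chromosome boundaries are ignored. The function names and the toy genomes are hypothetical.

```python
def orient_by_strips(chromosome):
    """Return {marker: +1 or -1} for markers inside a strip of length >= 2.

    Assumption: markers are integers numbered in human (reference) order.
    Markers that sit alone (singletons) get no orientation.
    """
    signs = {}
    i, n = 0, len(chromosome)
    while i < n:
        j, step = i, 0
        # Extend the run while neighbours differ by the same +1 or -1 step.
        while j + 1 < n:
            d = chromosome[j + 1] - chromosome[j]
            if d not in (1, -1) or (step != 0 and d != step):
                break
            step = d
            j += 1
        if j > i:  # run of length >= 2: increasing -> +1, decreasing -> -1
            for marker in chromosome[i:j + 1]:
                signs[marker] = step
        i = j + 1
    return signs


def orient_genome(genome):
    """Collect strip-based orientations over all chromosomes of a genome."""
    signs = {}
    for chromosome in genome:
        signs.update(orient_by_strips(chromosome))
    return signs


def signed_common_markers(cat, mouse):
    """Keep only markers oriented in BOTH genomes and attach their signs."""
    cat_signs, mouse_signs = orient_genome(cat), orient_genome(mouse)
    keep = set(cat_signs) & set(mouse_signs)

    def apply(genome, signs):
        return [[signs[m] * m for m in chrom if m in keep] for chrom in genome]

    return apply(cat, cat_signs), apply(mouse, mouse_signs)


# Toy genomes (hypothetical): each inner list is one chromosome.
cat = [[1, 2, 3, 9], [7, 6, 5, 4, 8]]
mouse = [[3, 2, 1], [4, 5, 9, 6, 7, 8]]
signed_cat, signed_mouse = signed_common_markers(cat, mouse)
print(signed_cat)
print(signed_mouse)
```

In this toy input, markers 8 and 9 are dropped because they are singletons in the cat genome, which mirrors how markers without an inferable orientation in either genome were removed from the real data. A fuller treatment would also stop strips at human chromosome boundaries and renumber the surviving markers.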