by Guillaume Bourque
Herpesvirus data
Hannenhalli et al., 1995 used herpesvirus gene orders as a
test case in one of the first studies of the Multiple Genome
Rearrangement Problem. We used the gene orders of Herpes simplex virus (HSV),
Epstein-Barr virus (EBV), and Cytomegalovirus (CMV). This dataset consists
of 3 genomes with 25 genes. Run demo.
Human, fruit fly, and sea urchin mtDNA data
Sankoff et al., 1996 analyzed human, sea urchin, and fruit fly mtDNA to derive the ancestral gene order.
This dataset consists of 3 genomes with 33 markers. Run demo.
Metazoan mtDNA data
Blanchette et al., 1999 used BPAnalysis in the rearrangement study of 11 metazoan mtDNAs. The genomes
come from 6 major metazoan groupings: nematodes (NEM), annelids (ANN),
mollusks (MOL), arthropods (ART), echinoderms (ECH), and chordates (CHO).
The data used was extracted from Jeffrey Boore's MGA Source Guide (originally from a version hosted at jgi.doe.gov; that page is gone, so this link is to a copy hosted elsewhere).
You can view the gene key here. This dataset consists of 11 genomes with 36 genes. Run demo.
Campanulaceae cpDNA data
This is one of the most challenging genome rearrangement datasets studied to date.
It was analyzed by Cosner et al., 2000 using GRAPPA. It consists of 13 cpDNAs with 105 markers. Run demo.
Human, cat, and mouse data
The raw data for the human, cat, and mouse comparisons are in the Excel spreadsheet LocusList101399.xls,
provided by Bill Murphy, who was then at the National Cancer Institute (now at Texas A&M).
The data for this example differ from the previous ones in two ways.
First, this is the first example dealing with multichromosomal genomes.
Second, in all the other datasets the orientation of the genes was implicitly assumed to be known,
whereas no information on the orientation of the genes was available for
this dataset. Click here for the description of the genomes in the common alphabet. Because current algorithms
for unsigned permutations are too time-consuming, we had to infer an orientation for the genes using preserved strips
(see Hannenhalli et al., 1996).
Ultimately, this limitation should matter less as more data become available.
Using the human genome as a reference, we first identified all the strips
in both the cat and the mouse genomes. We then assigned an orientation
to the markers based on these strips. Any marker whose orientation
could not be assigned by this method in either the cat or
the mouse genome was removed, leaving a common set of
114 markers. Click here for the description of the genomes in this alphabet. This dataset consists of 3 genomes with 114 markers
spread over multiple chromosomes. Run demo.
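
The strip-based orientation step described above can be sketched roughly as follows. This is a minimal illustration and not the code used to prepare the dataset. It assumes that a strip is a maximal run of at least two markers that are adjacent in a chromosome of the other genome and also consecutive (ascending or descending) on one chromosome of the reference. The function names `orient_by_strips` and `signed_genome` and the toy genomes are hypothetical. The sketch makes a single pass and does not re-detect strips after unoriented markers are removed, which the original procedure may or may not have done.

```python
def _step(pos, a, b):
    """+1 if b directly follows a in the reference, -1 if it directly precedes a, else 0."""
    (chrom_a, idx_a), (chrom_b, idx_b) = pos[a], pos[b]
    if chrom_a == chrom_b and abs(idx_b - idx_a) == 1:
        return idx_b - idx_a
    return 0


def orient_by_strips(reference, genome):
    """Return {marker: +1 or -1} for every marker of `genome` that lies in a strip.

    `reference` and `genome` are lists of chromosomes; each chromosome is a
    list of positive integer marker ids (unsigned). Markers outside any strip
    are left out of the result, i.e. they stay unoriented.
    """
    pos = {m: (c, i) for c, chrom in enumerate(reference) for i, m in enumerate(chrom)}
    signs = {}
    for chrom in genome:
        i = 0
        while i < len(chrom) - 1:
            step = _step(pos, chrom[i], chrom[i + 1])
            if step == 0:
                i += 1
                continue
            # Extend the strip while the direction stays the same.
            j = i + 1
            while j < len(chrom) - 1 and _step(pos, chrom[j], chrom[j + 1]) == step:
                j += 1
            for m in chrom[i:j + 1]:
                signs[m] = step  # ascending strip: +, descending strip: -
            i = j + 1
    return signs


def signed_genome(genome, signs, keep):
    """Apply the inferred signs and drop every marker not in `keep`."""
    return [[signs[m] * m for m in chrom if m in keep] for chrom in genome]


# Toy data (made up for illustration): the reference plays the role of human.
human = [[1, 2, 3, 4], [5, 6, 7, 8]]
cat = [[3, 2, 1, 5, 6], [4, 7, 8]]
mouse = [[1, 2, 7, 8], [6, 5, 4, 3]]

cat_signs = orient_by_strips(human, cat)
mouse_signs = orient_by_strips(human, mouse)

# Keep only the markers oriented in both genomes (the "common set").
common = set(cat_signs) & set(mouse_signs)

cat_signed = signed_genome(cat, cat_signs, common)
mouse_signed = signed_genome(mouse, mouse_signs, common)
human_signed = [[m for m in chrom if m in common] for chrom in human]

print(sorted(common))
print(cat_signed)
print(mouse_signed)
```

In this toy example marker 4 sits alone between strip boundaries in the cat genome, so it cannot be oriented there and is dropped from all three genomes, mirroring how the real dataset was reduced to its common set of markers.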