May 15, 2008

true names

i was up near toronto last week at the university of guelph, hanging out with the people from the biodiversity institute of ontario. paul hebert, who's the director of BIO, is one of the main proponents of DNA barcoding as a way of identifying organisms (major paper available here, for those so inclined). the project currently consuming the bulk of his time is the international barcode of life, an attempt to sample all (or most) species on the planet, sequence a reference portion of the DNA of those samples and construct a library of these reference DNA sequences. his facility at guelph has a group of passionate, intense staff, a workflow, a really nice bioinformatics platform, and partnerships in place to sequence hundreds of thousands of samples each year.

DNA barcoding and identification techniques like it represent a fundamental change in our orientation toward the world: we begin to move away from the imposition of presumed parameters and toward learning what might be called the true names of things. for the longest time, we've organised the world of organisms into groups, each composed of individuals that we observe to be more like each other than they are like other organisms -- taxonomists examine morphology, habits, and other characteristics, then divide up the world based on what these examinations reveal. these groups nest within each other (the hierarchy of groups is kingdom, phylum, class, order, family, genus, species), species being the lowest level of group organisation. the binomial latin name (a naming system introduced by linnaeus some two and a half centuries ago) is usually how we refer precisely to an organism that is a member of a species. the first part indicates the genus, the second the species within that genus; there are, for example, many types of maples (genus Acer) but Acer rubrum means you're talking about just the red or soft maple. it is the latter-day version of the naming of the animals (on which note, see this marvellous article).

the value of this exercise depends on there being at least one sequence of DNA in any major group of organisms that varies enough between species (and little enough within them) to be a robust identifier. for most eukaryotes, paul's operation sequences a short (648 basepair) section of cytochrome c oxidase subunit 1 (CO1), a gene in the mitochondrial DNA -- the sequence then becomes that species' barcode. DNA is the group name that's written into the organism itself, and it identifies precisely -- the barcode as currently implemented is a short form of that true name that seems to identify relatively robustly for the species to which it has been applied.

the idea of a power-conferring true name is liberally sprinkled through the tradition of magic (i like best ursula le guin's earthsea, where magic is declarative and verbal and depends on systematic acquisition of the true names of the world. this is a particularly nice section from a wizard of earthsea). with a library of such reference barcodes, rapid, robust, and accurate identification of animals, bacteria, plants, viruses, and all manner of other things becomes possible just by sequencing the same part of the DNA of whatever sample you've collected, comparing that sequence to the reference library, and finding matches. there is obvious value in rapid, accurate identification. for example, it would revolutionise and dramatically increase the efficacy of programs for the control of vector-borne diseases; immediate applications can be found in natural resource management, water quality monitoring, climate change early warning, agriculture, and many other domains that concern us.
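to make the lookup step concrete, here is a minimal sketch of matching a sample against a reference library. it is not how the guelph platform actually works: the sequences and species labels are invented, the sequences are assumed to be already aligned and of equal length, and the 2% cutoff is just an illustrative default. real barcodes are about 648 basepairs long and real pipelines align sequences before comparing them.

```python
# toy sketch: identify a sample by comparing its barcode to a reference library.
# every sequence and species label here is invented for illustration only.

def p_distance(a: str, b: str) -> float:
    """Fraction of positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def identify(sample: str, library: dict, max_distance: float = 0.02):
    """Return (species, distance) for the closest reference barcode.

    If even the closest reference is further away than max_distance
    (an assumed, illustrative cutoff), return (None, distance) instead.
    """
    best_species, best_distance = None, float("inf")
    for species, reference in library.items():
        d = p_distance(sample, reference)
        if d < best_distance:
            best_species, best_distance = species, d
    if best_distance > max_distance:
        return None, best_distance
    return best_species, best_distance


# hypothetical 20-base "barcodes" standing in for real reference sequences
REFERENCE_LIBRARY = {
    "species A": "ACGTACGTACGTACGTACGT",
    "species B": "ACGTTCGAACGTACCTACGA",
    "species C": "TCGAACGTTCGTACGAACGT",
}

if __name__ == "__main__":
    sample = "ACGTACGTACGTACGTACGA"  # differs from species A at one position
    print(identify(sample, REFERENCE_LIBRARY, max_distance=0.1))
    print(identify(sample, REFERENCE_LIBRARY))  # stricter default cutoff
```

with the looser cutoff the sample is assigned to its nearest reference; with the stricter default it is reported as unmatched, which is the case where a sample may belong to something not yet in the library.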

needless to say, the guild of academic biologists and taxonomists is hotly debating the value and validity of this DNA barcoding exercise.
