
Showing posts with label omics. Show all posts

Do-It-Yourself Phylogenetic Trees

I've been doing a lot of desktop science lately, and I'm happy to report that superb, easy-to-use online tools exist for creating your own phylogenetic trees based on gene similarities, something that's non-trivial to implement yourself.

The other day, I speculated that the fruit-fly Ogg1 gene, which encodes an enzyme designed to repair oxidatively damaged guanine residues in DNA, might derive from Archaea. The Archaea (in case you're not a microbiologist) comprise one of three super-kingdoms in the tree of life. Basically, all life on earth can be classified as either Archaeal, Eukaryotic, or Eubacterial. The Eubacteria are "true bacteria": they're what you and I think of when we think "bacteria." (So, think Staphylococcus and tetanus bacteria and E. coli and all the rest.) The Eukaryota are higher life forms, starting with yeast and fungi and algae and plankton, progressing up through grass and corn and pine trees, worms and rabbits and donkeys, all the way to the highest life form of all, Stephen Colbert. (A little joke there.) Eukaryotes have big, complex cells with a distinct nucleus, complex organelles (like mitochondria and chloroplasts), and a huge amount of DNA packaged into pairs of chromosomes.

Archaea look a lot like bacteria (they're tiny and lack a distinct nucleus, organelles, etc.), and were in fact considered bacteria until fairly recently. But in 1977, Carl Woese and George E. Fox provided persuasive evidence that members of this group of organisms were so different in genetic profile (not to mention lifestyle) that they deserved their own taxonomic domain. Thus, we now recognize certain bacteria-like creatures as Archaea.

The technical considerations behind the distinction between bacteria and archaeons are rather deep and have to do with codon usage patterns, ribosomal RNA structure, cell-wall details, lipid metabolism, and other esoterica, but one distinguishing feature of archaeons that's easy to understand is their willingness to live under harsh conditions. Archaeal species tend to be what we call extremophiles: They usually (not always) take up residence in places that are incredibly salty, or incredibly hot, or incredibly alkaline or acidic.

While it's generally agreed that eukaryotes arose after Archaea and bacteria appeared, it's by no means clear whether Archaea and bacteria branched off independently from a common ancestor, or perhaps one arose from the other. (A popular theory right now is that Archaea arose from gram-positive bacteria and sought refuge in inhospitable habitats to escape the chemical-warfare tactics of the gram-positives.) A complication that makes studying this sort of thing harder is the fact that horizontal gene transfer has been known to happen (with surprising frequency, actually) across domains.

Is it possible to study phylogenetic relationships, yourself, on the desktop? Of course. One way to do it: Obtain the DNA sequences of a given gene as produced by a variety of organisms, then feed those gene sequences to a tool like the tree-making tool at http://www.phylogeny.fr. Voila! Instant phylogeny.

The Ogg1 gene is an interesting case, because although the DNA-repair enzyme encoded by this gene occurs in a wide variety of higher life forms, plus Archaea, it is not widespread among bacteria. Aside from a couple of Spirochaetes and one Bacteroides species, the only bacteria that have this particular gene are the members of class Clostridia (which are all strict anaerobes). Question: Did the Clostridia get this gene from anaerobic Archaea?

Using the excellent online CoGeBlast tool, I was able to build a list of organisms that have Ogg1 and obtain the relevant gene sequences, all with literally just a few mouse clicks. Once you run a search using CoGeBlast, you can check the checkboxes next to organisms in the results list, then select "Phylogenetics" from the dropdown menu at the bottom of the results list. (See screenshot.)


When you click the Go button, a new FastaView window will open up, containing the gene sequences of all the items whose checkboxes you checked in CoGeBlast. At the bottom of this FastaView window, there's a small box that looks like this:


Click the Phylogeny.fr button (red arrow). Immediately, your sequences are sent to the French server, where they'll be converted to a phylogenetic tree in a matter of one to two minutes (usually). The result is a tree that looks something like this:


I've color-coded this tree to make the results easier to interpret. Creating a tree of this kind is not without potential pitfalls, because for one thing, if your DNA sequences are of vastly unequal lengths, the groupings made by Phylogeny.fr are likely to reflect gene lengths more than true phylogeny. For this tree, I did various data checks to make sure we're comparing apples and apples. Even so, a sanity check is in order. Do the groupings make sense? They do, actually. At the very top of the diagram (color-coded in green) we find all the eukaryotes grouped together: fruit-fly (Drosophila), yeast (Saccharomyces), fungus (Aspergillus). At the bottom of the diagram, Clostridium species (purplish red) fall into a subtree of their own, next to a tiny subtree of Methanobrevibacter. This actually makes a good deal of sense, because the two Methanobrevibacter species shown are inhabitants of feces, as are the nearby Clostridium bartlettii and C. diff. The fact that all the salt-loving Archaea members group together (organisms with names starting with 'H') is also indicative of a sound grouping. Overall, the tree looks sound.
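One of the data checks mentioned above (making sure the sequences aren't of vastly unequal lengths) is easy to script. What follows is only a rough sketch of how such a check might look, not the exact check I ran: the function names parseFasta and lengthOutliers are made up for illustration, and the 25% tolerance is an arbitrary cutoff.

```javascript
// Parse FASTA text into an array of { name, seq } records.
function parseFasta(text) {
    const records = [];
    let current = null;
    for (const rawLine of text.split(/\r?\n/)) {
        const line = rawLine.trim();
        if (line === "") continue;              // skip blank lines
        if (line[0] === ">") {                  // header line starts a new record
            current = { name: line.slice(1).trim(), seq: "" };
            records.push(current);
        } else if (current) {
            current.seq += line.toUpperCase();  // sequence may span several lines
        }
    }
    return records;
}

// Return the names of sequences whose length differs from the median length
// by more than `tolerance` (0.25 means 25%; a hypothetical, arbitrary cutoff).
// For an even number of sequences the upper of the two middle values is used.
function lengthOutliers(records, tolerance = 0.25) {
    if (records.length === 0) return [];
    const lengths = records.map(r => r.seq.length).sort((a, b) => a - b);
    const median = lengths[Math.floor(lengths.length / 2)];
    return records
        .filter(r => Math.abs(r.seq.length - median) > tolerance * median)
        .map(r => r.name);
}
```

Paste the FASTA text from the FastaView window into parseFasta(), hand the result to lengthOutliers(), and any names that come back are worth a second look before you trust the tree.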

If you're wondering what all the numbers are, the scale bar at the bottom (0.4) shows how much DNA sequence divergence corresponds to a branch of that particular length (by convention the units are substitutions per site, so it's not a simple percentage). The red numbers on the tree branches are support values: they indicate how confident the algorithm is that the nodes immediately underneath really do belong together. Probably the most important thing to know is that the evolutionary distance between any two leaves in the tree is proportional to the sum of the branch lengths connecting them. (The branch lengths are not explicitly specified; you have to eyeball it.) At the top of the diagram, you can see that the branch lengths of the two Drosophila instances are very short. This means they're closely related. By contrast, the branch lengths for Saccharomyces and the ancestor to Drosophila are long, meaning that these organisms are distantly related.
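If you'd rather not eyeball it, the sum-of-branch-lengths idea is simple enough to code up. Here's a minimal sketch, assuming you've typed the tree in by hand as nested objects; the tree shape, the leaf names, and the branch lengths below are invented for illustration and are not read off the figure.

```javascript
// Find the list of nodes from the root down to the leaf called `name`,
// or null if no such leaf exists.
function pathToLeaf(node, name, path = []) {
    const here = path.concat(node);
    if (!node.children || node.children.length === 0) {
        return node.name === name ? here : null;
    }
    for (const child of node.children) {
        const found = pathToLeaf(child, name, here);
        if (found) return found;
    }
    return null;
}

// Distance between two leaves: add up the branch lengths on both paths,
// skipping the part of the path the two leaves share.
function leafDistance(root, nameA, nameB) {
    const a = pathToLeaf(root, nameA);
    const b = pathToLeaf(root, nameB);
    if (!a || !b) throw new Error("leaf not found");
    let shared = 0;
    while (shared < a.length && shared < b.length && a[shared] === b[shared]) {
        shared++;
    }
    const sum = nodes => nodes.reduce((total, n) => total + (n.length || 0), 0);
    return sum(a.slice(shared)) + sum(b.slice(shared));
}

// Made-up example tree: `length` is the branch leading to each node.
const exampleTree = {
    children: [
        { length: 0.5, children: [
            { name: "fly1", length: 0.25 },
            { name: "fly2", length: 0.25 }
        ] },
        { name: "yeast", length: 1.5 }
    ]
};
```

With this toy tree, leafDistance(exampleTree, "fly1", "fly2") adds only the two short branches, whereas the fly-to-yeast distance also picks up the long internal branch and the long yeast branch.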

Just to give you an idea of the relatedness, I checked the C. botulinum Ogg1 protein amino-acid sequence against C. tetani, and found 63% identity of amino acids. When I compared C. botulinum's enzyme against C. difficile's, there was 52% identity. With Drosophila there is only 32% identity, and even that applies only to a 46% coverage area (versus 90%+ for C. tetani and C. diff). Bottom line, the Blast-wise relatedness does appear to correspond, in sound fashion, to tree-wise relatedness.
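For anyone wondering what "identity" and "coverage" mean operationally, here's a simplified sketch. It assumes you already have a pairwise alignment (two equal-length strings with "-" marking gaps) from BLAST or some other aligner; the function name alignmentStats is my own, and real BLAST output involves more bookkeeping than this.

```javascript
// Compute percent identity and percent query coverage for one pairwise alignment.
// alignedQuery / alignedSubject: equal-length strings, "-" marks a gap.
// queryLength: length of the full, unaligned query sequence.
function alignmentStats(alignedQuery, alignedSubject, queryLength) {
    if (alignedQuery.length === 0 || alignedQuery.length !== alignedSubject.length) {
        throw new Error("aligned strings must be non-empty and of equal length");
    }
    let identical = 0;      // columns where both residues are the same
    let queryResidues = 0;  // columns where the query is not a gap
    for (let i = 0; i < alignedQuery.length; i++) {
        const q = alignedQuery[i];
        const s = alignedSubject[i];
        if (q !== "-") queryResidues++;
        if (q !== "-" && q === s) identical++;
    }
    return {
        identity: 100 * identical / alignedQuery.length, // % of alignment columns
        coverage: 100 * queryResidues / queryLength      // % of the query aligned
    };
}
```

A low identity over a short coverage region (as with Drosophila above) is a much weaker signal than the same identity spread over 90% of the protein, which is why it pays to look at both numbers.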

Two things stand out. One is that not all of the Clostridium species group together. (There's a small cluster of Clostridia near the salt-lovers, then a main branch near the methane-producing Archaea. The out-group of Clostridia near the salt-lovers happen to all have chromosomal G+C content of 50% or more, which makes them quite different from the rest of the Clostridia, whose G+C is under 30%.) The other thing that stands out is that it does appear as if Clostridial Ogg1 could be Archaeal in origin, based on the relationship of Methanoplanus and Methanobrevibacter to the main group of Clostridia. (Also, the C. leptum group's Ogg1 may share an ancestor with the halophilic Archaea.) One thing we can say for sure is that Ogg1 is ancient.

It's tempting to speculate that the eukaryotes obtained Ogg1 from early mitochondria, and that early mitochondria were actually Archaeal endosymbionts. The first part is easily true, because we know that early mitochondria quickly exported most of their DNA to the host nucleus. (Today's mitochondrial DNA is vestigial. Well over 90% of mitochondrial genes are actually in the host nucleus. Things like mitochondrial DNA polymerase have to be transcribed from nucleus-generated RNA.) Whether or not early mitochondria were Archaeal endosymbionts, no one knows.

Anyway, I hope this shows how easy it is to generate phylogenetic trees from the comfort of a living room sofa, using nothing more than a laptop with wireless internet connection. Try making your own phylo-trees using CoGeBlast and Phylogeny.fr—and let me know what you find out.

A Bioinformatics Bookmarklet

Sometimes you want to scrape some screen data and analyze it on the spot without copying it to another program. It turns out there's an easy way to do just that. Just highlight the information (by click-dragging the mouse to Select a section of screen data), then run a piece of JavaScript against the selection.

Example: I do a lot of peeking and poking at DNA sequences on the web. Often, I'm interested in knowing various summary statistics for the DNA I'm looking at. For example, I might see a long sequence that looks like AGTTAGAAAACCTCAGCTACTAG (etc.) and wonder what the G+C content of that stream is. So I'll select the text by click-dragging across it. Then I'll obtain the text in JavaScript by calling getSelection().toString(). Then I parse the text and display the results in an alert dialog.

Suppose I've selected a run of DNA on-screen and I want to know the base content (the amounts of G, C, T, and A).


var text = getSelection().toString(); // get the selected text as a string
text = text.toUpperCase(); // convert it to upper case
text = text.replace(/[^ACGT]/g, ""); // drop spaces, line breaks, digits, etc.

var bases = { G: 0, C: 0, T: 0, A: 0 }; // a place to store the base counts

// now loop over the string contents:
for (var i = 0; i < text.length; i++)
    bases[ text[i] ]++; // bump the count for that base

// format the data for viewing (each value is a fraction of all bases counted)
var msg = "G: " + bases.G/text.length + "\n";
msg += "C: " + bases.C/text.length + "\n";
msg += "A: " + bases.A/text.length + "\n";
msg += "T: " + bases.T/text.length + "\n";
msg += "GC Content: " + (bases.G + bases.C)/text.length;

// view it:
alert( msg );
 
If I run this script against a web page where I've highlighted some DNA text, I get:



The nice part is, you can put the above code in a bookmarklet, associate the bookmarklet with a button, and keep it in your bookmark bar so that whenever you want to run the code, you can just point and click. To do the packaging, reformat the above code (or your modified version of it) as a single line of code preceded by "javascript:" (don't forget the colon), then set that code as the URL of a bookmark. Instead of going to a regular URL, the browser will see "javascript:" as the URL scheme and execute the code directly.
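If hand-reformatting code into a single line sounds tedious, you can let JavaScript do the packaging. The following is a rough sketch of one way to do it, not the only way: toBookmarklet is a helper name I made up, and showSelectionLength is just a stand-in payload. The idea is to wrap the code in a function, turn the function into text, and URL-encode the text so that line breaks and comments survive inside the bookmark URL.

```javascript
// Turn a function into a "javascript:" URL suitable for a bookmark.
// The function is wrapped so that it runs immediately when the bookmark is
// clicked, and encodeURIComponent() makes newlines and spaces URL-safe.
function toBookmarklet(fn) {
    const source = "(" + fn.toString() + ")();";
    return "javascript:" + encodeURIComponent(source);
}

// Stand-in payload (hypothetical): report how many characters are selected.
function showSelectionLength() {
    // runs in the browser page when the bookmark is clicked
    alert(getSelection().toString().length);
}

// This is the string you would paste in as the bookmark's URL.
const bookmarkletUrl = toBookmarklet(showSelectionLength);
```

Wrapping everything in a function has two side benefits: your variables don't leak into the page you're looking at, and because the function returns nothing, the browser doesn't try to replace the page with the script's result.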

Bookmarklets of this sort have proven to be a major productivity boon for me in various situations as I cruise the web. When I see data I want to analyze, I don't have to copy and paste it to Excel (or whatever). With a bookmarklet, I can analyze it instantly, sur la vitre.




More Science on the Desktop

Not to keep harping on the amazing power of desktop omics tools, but I thought I'd share a tip for those of you into genome-mining. The tip in a nutshell is that if you gang-load a bunch of FASTA sequences (DNA sequence data) into the FeatView form at http://genomevolution.org, then click the rather inconspicuous button labeled "Phylogeny.fr" at the bottom left of the FeatView page, you'll be taken automatically to http://www.phylogeny.fr, where you'll get a realtime-generated phylogenetic tree based on the sequence data you provided in FeatView, with no effort on your part (it's truly a one-click operation). Copy and paste DNA sequences into FeatView, click one button, and 30 seconds later a tree shows up on your screen, looking (perhaps) something like this:


The reason I made this tree is that I wasn't satisfied with my knowledge of the relatedness of certain weird microorganisms I've recently run into. Namely:
  • Ralstonia (which I mentioned yesterday), WEIRD BECAUSE: It turns hydrogen gas and CO2 into plastic.
  • Bordetella, a bronchial infection agent; WEIRD BECAUSE: It turns out to be very similar, genetically, to Ralstonia
  • Burkholderia, a soil organism (and human and animal pathogen), WEIRD BECAUSE: It has an unexpectedly large amount of genetic similarity to Ralstonia and Polynucleobacter
  • Polynucleobacter, a ditch-water bacterium, WEIRD BECAUSE: It can live as an intracellular parasite of freshwater ciliates or it can live independently in soil (making it potentially a great study organism for determining the genetic bases of intracellular symbiosis)
  • Thiomicrospira, a very tiny CO2- and sulfur-loving organism, WEIRD BECAUSE: It can only be found near deep-sea thermal vents (see my previous writeup)
  • Polaromonas, a relatively newly discovered and still poorly understood bacterium, WEIRD BECAUSE: It is abundant in glacier ice on multiple continents. Plus it has an amazing (and totally unexpected) amount of genetic overlap with our good friend Bordetella, the whooping-cough bug.
If you're not familiar with how bacterial classification works, let's just say it's a mess. There's a long historical tradition of classifying microorganisms based on a hodgepodge of ad hoc methods involving everything from physical appearance under the microscope (especially after staining with crystal violet), to the habitat of the organism, to its ability to metabolize various substances, its ability to make spores, adaptation to oxygen or lack of oxygen, serological characteristics, etc. It's always been an error-prone system, resulting in many misclassifications and later corrections, owing to its inconsistency and basic irrationality, to put it bluntly. With the advent of molecular genetic techniques, it's now possible to create accurate phylogenies based on little more than DNA sequence differences, usually involving the 16S ribosomal RNA (more here).

Freshwater ciliates (like this Euplotes) are
home for Polynucleobacter endosymbionts.
As big an advance as ribosome-based phylogeny is, it's pretty far from ideal (IMHO), mainly because it ignores phenotypes. In fact it's pretty far removed from anything at all having to do with an organism's ecology, metabolism, mode of living, etc. What are we really measuring when we measure relatedness according to a 16S ribosomal yardstick? Just the rate of random mutation accumulation in a pretty uninteresting cell artifact. I'd rather have a yardstick that's tied to phenotypic reality than to a slow-to-change, "highly conserved" piece of cold dead scaffolding.

So to create my own "family tree" of two dozen or so microbes, I said to hell with 16S ribosomes and decided to use, as my yardstick, genetic variation in the GroEL gene, which codes for the 60-kiloDalton heat-shock protein. I chose this protein (or rather, the gene for it) as my phylo-yardstick for a number of reasons. First, the DNA sequence is sizable, at about 1643 nucleotides (making it somewhat bigger than the 16S rDNA). It's important to have a large yardstick gene when looking for faint genetic signals. Secondly, this protein is essentially universal in prokaryotes. It's ubiquitous but not necessarily highly conserved, in the same sense that rRNA is highly conserved. ("Highly conserved" is not what you want. Think about it. Taken to the extreme, a "highly conserved" sequence is invariant. It never changes. And is therefore useless for phylogenetics.) Thirdly, the GroEL heat-shock protein has multiple intracellular touchpoints: It's known to interact with GroES, ALDH2, and dihydrofolate reductase, and it's involved in signal transduction (it's induced not just by heat but by hydrogen peroxide). Not to overlook the obvious, but it is also a touchpoint protein for any enzyme that can be repaired by the 60kDa heat shock protein. That's probably dozens if not hundreds of enzymes. Why is that important? Think about it: A protein that is sensitive to the 3D conformational requirements of other proteins has to evolve in response to the needs of all the proteins it services. A thermophile (Thiomicrospira) is going to need a different heat-shock repair system than a psychrophile (Polaromonas). A salt-lover needs a different one than a freshwater-lover. GroEL has to reflect, in its own structure, the many shifting requirements of the host proteome. These considerations make GroEL a highly appropriate basis gene for phylogenetic analysis.
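To make the "highly conserved is not what you want" point concrete: one crude way to size up a candidate yardstick gene is to count how many columns of a multiple alignment actually vary. This is only a back-of-the-envelope sketch (the function name countVariableSites is mine, and it assumes the sequences have already been aligned to equal length by some other tool, with "-" for gaps).

```javascript
// Count alignment columns in which at least two different bases occur.
// alignedSeqs: array of equal-length strings; "-" (gap) is ignored.
function countVariableSites(alignedSeqs) {
    if (alignedSeqs.length === 0) return 0;
    const width = alignedSeqs[0].length;
    if (!alignedSeqs.every(s => s.length === width)) {
        throw new Error("all aligned sequences must have the same length");
    }
    let variable = 0;
    for (let col = 0; col < width; col++) {
        const seen = new Set();                  // distinct bases in this column
        for (const s of alignedSeqs) {
            const ch = s[col].toUpperCase();
            if (ch !== "-") seen.add(ch);
        }
        if (seen.size > 1) variable++;
    }
    return variable;
}
```

A perfectly invariant gene scores zero here and can't separate anything from anything; a gene with plenty of variable columns at least has raw material for a tree-building program to work with.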

And frankly, I think the GroEL-based phylo-tree phylogeny.fr spit out for me (see illustration further above) speaks for itself. It's a remarkably informative (and accurate) tree. GroEL evolutionary differences not only accurately grouped endosymbionts together, soil organisms together, aquatic organisms, etc., it also correctly grouped the "enteric-alike" Erwinia with E. coli and Shigella, and it cannily put Polaromonas with soil organisms (rather than aquatics), which I think is correct, based on recent Polaromonas isolates being found in soil rather than snow. Likewise, it's good to see Bdellovibrio (a freshwater bug) clustered with Polynucleobacter (which is symbiotic with a ciliate protozoan), with Thiomicrospira (the saltwater hydro-vent organism) a very nearby out-node.

If you get an infection while in a hospital, pray
it's not Clostridium difficile, which is often deadly.
A harder call to make is Clostridium difficile, which is present in 1% to 5% of non-ill people's intestines. Is it an enteric (a la E. coli)? Definitely not. The Clostridia (botulism, tetanus, etc.) are spore-forming soil bacteria. Their placement in the tree not far from the soil-dwelling spore-former, Bacillus thuringiensis, is thus eminently correct. Bacillus is a proximal out-node relative to Clostridium, which is understandable in that Bacillus is aerobic whereas Clostridia are strict anaerobes.

Buchnera (an aphid symbiont) comes at an odd location, much further away from the insect-dwelling Wolbachia than I would have predicted, but then again Buchnera's host feeds on cold sap where Wolbachia's hosts typically feed on warm blood. All the organisms around Wolbachia in the tree are hemophiles.

Our good friend Bordetella (of pertussis fame) is placed firmly in the soil group. I think that's real and significant. When you start to look at Bordetella's high DNA sequence similarity with Ralstonia and Burkholderia, it would be surprising, actually, if it fell anywhere else in the tree.

Honestly, when I took Bacterial Ecology 201 in college, many years ago, it was under duress and I hated the experience. But now, decades later, I'm starting to like it. With tools like those available for free at http://genomevolution.org and http://www.phylogeny.fr, what's not to like?

Deep-Sea Vents: The Mosquito Connection

Quick: What species of life on earth is the most abundant? (Which species has more living members than any other species?) Hint: If an alien probe lands in a random location on earth, chances are better than 70% that the probe will encounter this organism.

If you're thinking in terms of the ocean, you're on the right track. What may surprise you is the connection between the world's-most-populous-organism (to be revealed shortly) and the mosquitoes that've been dive-bombing your neck all week. Equally amazing is the link between the mosquitoes in your back yard and hydrothermal vents in the ocean floor.

The hundreds of bright little particles at the
narrow end of this wasp egg are Wolbachia cells.
I wasn't thinking about marine biology or deep-sea hydrothermal vents when I went online at http://genomevolution.org the other day to do a little nosing around into the genome of Wolbachia pipientis, the ultra-tiny bacterial parasite carried by nearly every mosquito on earth. (Caution: Don't attempt the following DNA-analysis tricks on your own unless you want to become thoroughly addicted to desktop omics. I'm a microbiologist by training. I can do these stunts safely.) "Parasite" is actually the wrong word. Our tiny friend Wolbachia doesn't just parasitize the mosquito; it's an integral part of the mosquito. Wolbachia can't live outside its insect host—and guess what? The host frequently can't live without Wolbachia. The two provide essential services for each other, an arrangement known as mutualism.

I would argue that Wolbachia is more than a mutualistic symbiont: It's a proto-organelle, something very close to what Lynn Margulis had in mind as the ancestor of today's mitochondrion.

Wolbachia can't live on its own in the outside world (as far as anybody knows): it needs to live inside a host (generally an arthropod, although filarial worms also carry Wolbachia). Inside its host it occupies a very special niche: It lives in the nursery cells of the insect's ovary—the cells that will go on to become egg cells.

This is no ordinary symbiosis. I mentioned in an earlier post that Wolbachia carries with it genes for reverse-transcriptases, resolvases, recombinases, transposases, translocases, DNA polymerases, RNA polymerases, and phage integrases—a complete suite of retroviral machinery, designed for export of foreign DNA into host DNA. And indeed, researchers have found that Wolbachia DNA is quite often embedded in the host's own nuclear DNA. (One group, looking at four insect hosts and four nematode hosts, found anywhere from 500 base-pairs to over a million base pairs of Wolbachia DNA residing in the nucleus. Another group found 45 Wolbachia genes incorporated in a fruit-fly host's nuclear DNA.) The situation with Wolbachia thus parallels the situation with mitochondria, where we know that 97% of the gene products that go to make up a mitochondrion are actually encoded in nuclear DNA, not mitochondrial DNA.

When you encounter an organism as baffling as Wolbachia, oftentimes you want to know what its relatives are—what it's most closely related to. When a new or poorly understood organism has a close relative that's already well-studied, sometimes you learn a lot in a hurry. That's particularly true of pathogens (not that Wolbachia is a pathogen per se). Pathogens have virulence strategies of various kinds. Maybe Wolbachia has symbiosis strategies that it learned from a relative?

The problem with a lot of the super-tiny microbes (which Wolbachia definitely is, with only a quarter as much DNA as E. coli) is that their relatedness is not always well understood. Organisms are assigned a taxonomic slot, then the assignment changes a few years later, after they're better-studied. (So for example, Cowdria ruminantium was eventually renamed Ehrlichia ruminantium, and a bunch of former Ehrlichias are now Neorickettsias, except the ones that attack red blood cells, which are now Anaplasmas.) Taxonomy at this end of the evolutionary tree is definitely a work in progress.
Deep-sea thermal vents like this one
are home to organisms like Thiomicrospira
that can grow on sulfide, CO2, and basic salts.

Fortunately, it's easy nowadays (what with so many organisms' DNA sequences available online) to go on the web and compare genomes directly, using a tool like SynMap, which is what I started doing with Wolbachia. I started going down the list of mini-microorganisms and began running DNA similarity tests of Wolbachia against Ehrlichia, Neorickettsia, Anaplasma, Chlamydia, and "the usual suspects" at the ultra-small-chromosome end of the tree of life.

What I found surprised me. A bizarre little bacterium called Thiomicrospira kept showing up in my BLAST searches as having many genes in common with Wolbachia (based on sequence matches in large numbers of genes). None of the taxonomy charts showed the two to be related. But DNA doesn't lie. I kept coming up with matches across hundreds of genes. (Bear in mind, Wolbachia has only about 1300 genes to begin with, which is very small, even for a bacterium.)

What's bizarre about Thiomicrospira is that it's one of those fairly newly discovered microbes that lives on sulfur, heat, and CO2 at the bottom of the ocean, in total darkness, in the vicinity of thermal vents. Thiomicrospira is the kind of life form NASA takes a great interest in, because it could be a prototype for exactly the type of survive-in-the-dark CO2-using organism that might live under the ice crust of Europa (Jupiter's moon). In theory, there could be hydrothermal vents on the floor of the large ocean of liquid water that NASA is pretty sure exists under Europa's ice. If there's life down there, it could very well look like Thiomicrospira.

But why should Thiomicrospira have so many genes in common with a mosquito symbiont? Thiomicrospira lives at the bottom of the ocean; Wolbachia lives inside arthropod eggs. One obtains its carbon in the form of CO2; the other produces CO2 as a waste product. One is adapted to live in warm salt water; the other lives in cold-blooded insects. In theory, these two germs couldn't be further apart. And yet, oddly enough, they not only have hundreds of genes in common, the genes are well-matched from a DNA sequence-similarity standpoint. Thiomicrospira's DNA even incorporates a prophage module, and some of its phage genes show a high percentage base-pair similarity with the phage genes of Wolbachia. (See screen shot below.)
Remarkably, Thiomicrospira and Wolbachia share certain phage genes in common, as shown here. The genes have a DNA sequence identity of about 60%.
After doing a little more detective work, I found an organism that might very well form a "missing link" between the mosquito symbiont and the thermal-vent dweller. This organism kept showing up in my analyses as having a high degree of DNA similarity with both Thiomicrospira and Wolbachia. The organism in question is Pelagibacter ubique (now known as Candidatus pelagibacter, although some might question this taxonomic assignment since all other Candidatus members are obligate intracellular symbionts), and it's an astonishing organism in two ways: First, it's the smallest non-parasitic (free-living) bacterium known to science, with only 1.3 million base-pairs in its DNA (making it slightly smaller than Wolbachia and its tiny cousins). Secondly, it's the most numerous living thing on earth. It's present in large amounts in every one of earth's oceans.

Pelagibacter was placed in the Candidatus clade in 2007 due to its small genome and cell size and certain ribosomal markers. It has a very mitochondria-like genetic profile, and in fact some people think Pelagibacter is the ancestor of today's mitochondrion, a theory that's all the more satisfying when you consider that Pelagibacter is both ancient and tied to the sea.

My analysis using SynMap found that Pelagibacter and its thermal-vent-dwelling cousin Thiomicrospira share about 660 genes (out of 1480 or so for Pelagibacter), whereas Wolbachia and Pelagibacter share around 581, and Thiomicrospira and Wolbachia share around 1000. These are so-called non-syntenous point matches between genes; instances where the same gene occurs in both organisms, with a high percentage of base-pair matching. Synteny is a concept that takes gene-matching one step further and says that clusters of similar genes are what count. Synteny at the level of higher plants and animals is one thing, but at the level of a mini-microbe it tends to lack meaning, because the genes of bugs like Wolbachia are notoriously mobile: They find new positions on the chromosome over time (probably because of the large number of transposases, nucleases, and integrases in the genome). Even so, I decided to carry out a bit of syntenic analysis to see what I could find out.

For purposes of my analysis I defined a "syntenon" as three or more co-proximal genes that match three or more genes on the other organism's genome. But to be part of a syntenon, all three genes in a triplet have to occur within a 30-gene span (and match 3 genes in a 30-gene span on the other organism's DNA) plus the genes have to be in the same order in both organisms.
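In code, my working definition of a syntenon might be sketched roughly as follows. To be clear, this is my own reading of the rule above, not SynMap's actual algorithm. The input format is an assumption: each match is a pair [a, b], where a is the gene's ordinal position on the first genome and b is the position of its match on the second, with each gene matched at most once. Reverse-order (inverted) runs are ignored for simplicity.

```javascript
// Count genes on genome A that belong to at least one syntenic triplet:
// three matched genes within a `span`-gene window on both genomes,
// appearing in the same order on both.
function syntenicGeneCount(matches, span = 30) {
    const sorted = matches.slice().sort((x, y) => x[0] - y[0]); // order by position on A
    const inSyntenon = new Set();
    for (let i = 0; i < sorted.length; i++) {
        for (let j = i + 1; j < sorted.length; j++) {
            if (sorted[j][0] - sorted[i][0] >= span) break;     // outside the window on A
            for (let k = j + 1; k < sorted.length; k++) {
                if (sorted[k][0] - sorted[i][0] >= span) break; // outside the window on A
                const [ai, bi] = sorted[i];
                const [aj, bj] = sorted[j];
                const [ak, bk] = sorted[k];
                const distinctOnA = ai < aj && aj < ak;
                const sameOrderOnB = bi < bj && bj < bk;
                if (distinctOnA && sameOrderOnB && bk - bi < span) {
                    inSyntenon.add(ai);
                    inSyntenon.add(aj);
                    inSyntenon.add(ak);
                }
            }
        }
    }
    return inSyntenon.size;
}
```

The triple loop is brute force, but since each inner loop bails out as soon as it leaves the 30-gene window, it stays quick even for a genome with a couple of thousand genes.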

A planet-spanning waterworld is thought to exist under
Europa's icy outer crust. If thermal vents exist at the
bottom, any life that exists may look a lot like Thiomicrospira.
Using SynMap, I found that whereas Wolbachia and Pelagibacter share around 157 syntenic genes, and Thiomicrospira and Wolbachia share around 132, Thiomicrospira and Pelagibacter share 250 (which makes sense in that both are ocean-dwellers). For comparison-and-control purposes, I did a triplet match of Thiomicrospira against another chemoautotroph (an organism that gets energy from inorganic chemicals, and carbon from CO2), namely Methanothermobacter marburgensis. There were only 53 syntenic triplets in common between the two chemoautotrophs. (Between Wolbachia and Methanothermobacter, on the other hand, there were only 3 triplet-matches.) Doing a match between two Wolbachia species (a mosquito-dwelling variety and a fruit-fly-dwelling cousin) produced 522 gene matches in syntenic triplets.

It seems reasonable to me, based not just on the previous sorts of analysis but also direct inspection of the genomes (in terms of their respective protein products), that Thiomicrospira evolved from Pelagibacter. Pelagibacter is the most abundant life form in the ocean, and perhaps the oldest. Pelagibacter is also very mitochondria-like, and so is Thiomicrospira, which has rhodanese-like proteins, the full cytochrome system, redox enzymes, citric-acid-cycle enzymes, plus certain characteristic membrane and sensor proteins, flippases, etc. (For what it's worth, Thiomicrospira has the highest signal-transduction profile I've ever seen at http://mistdb.com, again making it very mitochondrial-feeling.)

I'm tempted to say, similarly, that Thiomicrospira and Wolbachia are related. They have phage proteins in common. They both have genes for patatin proteins. They share multiple drug resistance genes. (That's not so strange. Antibiotics occur naturally in the environment.) They share genes for Flp-type pilins. Plus many more coincidences, big and small.

At first blush, a deep-sea thermal vent seems pretty far removed, environmentally, from the egg cell of a mosquito. How to reconcile the difference? Actually, I see similarities. Thiomicrospira thrives at temperatures of 28 to 32 degrees Celsius (which is also true of mosquitoes, although they prefer the 28-degree end of the scale). And blood (the preferred food source for mosquitoes) is comparable in pH and salinity to seawater. Also, mosquitoes have an aquatic lifecycle: they require brackish water in which to lay eggs. Mosquitoes and salt marshes go back millions of years.

It's even possible that Wolbachia might live in deep-sea-vent-dwelling host organisms. In fact, I predict they will be found there. Why? Because in addition to inhabiting flying insects, spiders, mites, and ticks (and filarial worms), Wolbachia have also been found in a very high percentage of crustaceans. We know that crustaceans are often found living near deep-sea thermal vents; and many crustaceans show the characteristic feminization of genetic males that's so often the tipoff to a massive Wolbachia presence in insect populations.

Insects and crustaceans represent two of the oldest, most successful, and most widely distributed life forms of the animal kingdom. Would it really be so surprising if the bacteria that colonize these life forms are closely related to the most common marine bacteria on the planet? I don't think so. Stranger things have happened.


