Monday, February 1, 2016

Tardigrades and phylogenetic networks


In this blog we have always championed the use of Exploratory Data Analysis prior to phylogenetic analyses. This approach explores the characteristics of the data before making formal inferences about possible evolutionary scenarios. One of the reasons for doing this is the possibility of data errors. That is, we need to distinguish between estimation errors deriving from our experimental procedures and real biological scenarios, because both of these will result in complex patterns in our data.

One possible classification of the potential causes of complex data patterns in phylogenetics is this:

Estimation errors
(i) incorrect data
— inadequate data-collection protocol
— poor laboratory / museum / herbarium technique
— lack of quality control after data collection
— misadventure
(ii) inappropriate sampling
— distant outgroup
— rapid evolutionary rates
— short internal branches
(iii) model mis-specification
— wrong assessment of primary homology
— wrong substitution model
— different optimality criteria

Biological complexity
(iv) analogy
— parallelism
— convergence
— reversal
(v) homology
— deep coalescence
— duplication–loss
— hybridization
— introgression
— recombination
— horizontal gene transfer
— genome fusion

The scientific literature has a number of prime examples where people have asserted a case of biological complexity that has subsequently been questioned, and attributed to estimation errors instead.

For example, many of you will have noted the recent attention given to the release of genome sequences from tardigrades, a group of microscopic animals often alleged to be the animals most resistant to extreme environmental conditions in the world. Two rival papers have appeared:
Thomas C. Boothby et al. (2015) Evidence for extensive horizontal gene transfer from the draft genome of a tardigrade. Proceedings of the National Academy of Sciences of the USA 112: 15976–15981.
Georgios Koutsovoulos et al. (2015) The genome of the tardigrade Hypsibius dujardini. BioRxiv preprint 33464.
The former paper attributes the observed phylogenetic complexity to horizontal gene transfer (group v in the list above), while the latter attributes it to sequencing errors (group i). This situation is discussed in more detail elsewhere on the web, for example:
Rival scientists cast doubt upon recent discovery about invincible animals
How did these indestructible pond critters get their genes?
This difference in the possible cause of the complexity matters for phylogenetic networks, because both estimation errors and biological complexity will appear as reticulation patterns in any network. It is especially important when asserting evolutionary scenarios such as horizontal gene transfer, because usually the only evidence for any such gene flow is the complexity of the phylogenetic network; that is, there is no independent experimental evidence, and we are relying entirely on the phylogenetic pattern analysis. Estimation errors must thus be eliminated prior to the phylogenetic analysis, if we are to produce a high-quality network.

The current situation potentially has unfortunate consequences. For example, there are continual comments, particularly from zoologists, that horizontal gene flow is rare, even though there is a large amount of evidence to the contrary. Situations like the current one can only add fuel to this argument, if strong claims of gene flow turn out to be erroneous. There is no quantitative basis for an assertion that gene flow is rare in zoology: those who have looked for reticulate evolution in animals have found it, and those who haven't looked haven't found it.

In the end, data-display networks are useful for displaying incongruent data patterns, but the source of the incongruence needs to be identified before these networks are turned into evolutionary networks (either explicitly drawn or verbally implied).

Monday, January 25, 2016

What is in Traditional Chinese Medicines — a network analysis


I used to work for the New South Wales Institute of Technology, which in the late 1980s mutated into the University of Technology Sydney (UTS). During this process it acquired an organization called the College of Traditional Chinese Medicine. This group was placed in the Faculty of Science, for lack of anywhere else to put it.

These people had little contact with the rest of the faculty, and I don't recall ever meeting any of them. Indeed, their work was not really based on Western science. These days, the UTS College of Traditional Chinese Medicine offers a Bachelor of Health Science in Traditional Chinese Medicine, although it is most visible through the UTS Chinese Herbal Medicine Clinic, which is also nominally still part of the Faculty of Science.

The presence of Traditional Chinese Medicine (TCM) in an Australian university setting is relevant to today's blog post, because Australia seems to be one of the few places to have shown any interest in connecting TCM and Western science. Indeed, there is also a Uniclinic of Traditional Chinese Medicine within the School of Science and Health at Western Sydney University. Most of the interest in studying TCMs has otherwise been confined to Asia (see Dennis Normile. 2003. The new face of Traditional Chinese Medicine. Science 299: 188–190).


Recently, a group of Australian researchers decided to have a look at the content of some of the TCMs available in their country:
Megan L. Coghlan, Garth Maker, Elly Crighton, James Haile, Dáithí C. Murray, Nicole E. White, Roger W. Byard, Matthew I. Bellgard, Ian Mullaney, Robert Trengove, Richard J.N. Allcock, Christine Nash, Claire Hoban, Kevin Jarrett, Ross Edwards, Ian F. Musgrave & Michael Bunce (2015) Combined DNA, toxicological and heavy metal analyses provides an auditing toolkit to improve pharmacovigilance of traditional Chinese medicine (TCM). Scientific Reports 5: 17475.
Some of these TCMs (12 out of 26) are registered for use with the Therapeutic Goods Administration, which regulates their use within Australia, while the other TCMs are not (which technically means that they should not have been commercially available). However, there is little in the way of pharmacovigilance of herbal medicines anywhere in the world.

All of the products were comprehensively audited for their biological (via next-generation DNA sequencing), toxicological (via LC-MS analysis) and heavy metal (arsenic, cadmium and lead, via SF-ICP-MS analysis) contents. For the latter two analyses, the amount of material detected was also quantified.

As usual, we can use a phylogenetic network to visualize these data, which I have done using a neighbor-net network on the presence-absence data. The result is shown in the figure. TCMs that are closely connected in the network are similar to each other based on their detected contents, and those that are further apart are progressively more different from each other. The registered products are highlighted in red.
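For readers who want to try something similar, a neighbor-net needs a pairwise distance matrix as input, so the presence-absence data first have to be converted to distances. The following is a minimal sketch of that step only, not the procedure used for the figure. It assumes the Jaccard distance (the post does not say which distance was used), and the sample names and contents are hypothetical toy values, not data from the paper. The resulting matrix could then be passed to a network program such as SplitsTree.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two sets of detected contents.

    Assumption: Jaccard is only one reasonable choice for
    presence-absence data; the original analysis may have used another.
    """
    union = a | b
    if not union:
        return 0.0  # two empty samples are treated as identical
    return 1.0 - len(a & b) / len(union)


def distance_matrix(samples):
    """Return (sorted names, nested dict of pairwise distances)."""
    names = sorted(samples)
    dist = {name: {name: 0.0} for name in names}
    for x, y in combinations(names, 2):
        d = jaccard_distance(samples[x], samples[y])
        dist[x][y] = d
        dist[y][x] = d
    return names, dist


# Hypothetical toy data: each sample is the set of items detected in it.
samples = {
    "A": {"plant_1", "plant_2", "lead"},
    "B": {"plant_1", "plant_2", "arsenic"},
    "C": {"plant_3", "paracetamol"},
}

names, dist = distance_matrix(samples)
for x in names:
    print(x, " ".join(f"{dist[x][y]:.2f}" for y in names))
```

In this toy example, samples A and B share two of four items and so sit closer together than either does to C, which shares nothing with them.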


There is wide variation among the products. The seven most divergent TCMs in the network are all unregistered, with the remaining seven unregistered products being more similar to the registered TCMs. Only two TCMs (TCM10 and TCM17) have no discrepancies between the detected contents and what was declared (either to the regulatory agency, or to the consumer in the form of an ingredients list).

The authors summarize this situation:
Genetic analysis revealed that 50% of samples contained DNA of undeclared plant or animal taxa, including an endangered species of Panthera (snow leopard). In 50% of the TCMs, an undeclared pharmaceutical agent was detected including warfarin, dexamethasone, diclofenac, cyproheptadine and paracetamol. Mass spectrometry revealed heavy metals including arsenic, lead and cadmium, one with a level of arsenic >10 times the acceptable limit.
This study presents genetic, toxicological, and heavy metal data that should be of serious concern to regulatory agencies, medical professionals and the public who choose to adopt TCM as a treatment option. Of the 26 TCMs investigated, all but two can be classified as non-compliant on the grounds of DNA, toxicology and heavy metals, or a combination thereof. In total, 92% were deemed non-compliant with some medicines posing a serious health risk.
Such findings are not only of concern to the consumer, but also flag the need for detailed auditing of herbal preparations prior to evaluation in clinical trials.