Wednesday, July 4, 2012

Time inconsistency in evolutionary networks


The temporal ordering of the nodes (and branches) is usually treated as an important feature in an evolutionary network of biological organisms, because the order must be time consistent (Baroni et al. 2004, 2006; Moret et al. 2004). That is, for reticulation events the "horizontal" gene flow can only occur between species that are contemporaries. So, speciation events occur successively but reticulation events occur instantaneously (Sang and Zhong 2000).

For example, it would be unrealistic to hypothesize either a hybridization or a horizontal gene transfer event between a species and one of its own ancestors. Furthermore, each reticulation event must not only be consistent on its own but must be consistent in relation to all of the other events.

Mathematically, inconsistency creates directed pseudo-cycles in the network graph, so that it is not acyclic, as required for an evolutionary history (see previous blog post). Time consistency is thus seen as a useful means of validating a network as a potential biological history, and can even be used as a criterion for choosing among otherwise equally optimal networks.
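
As a rough illustration of this condition, in the spirit of the temporal labelling used by Baroni et al. (2006): a network is time consistent if every node can be given a time stamp that strictly increases along tree (speciation) arcs and stays the same along reticulation arcs. One way to test this, sketched here in Python, is to merge the two ends of every reticulation arc into a single group of contemporaries and then check that the tree arcs create no directed cycle among those groups. The function name, the arc-list input format and the toy node labels are all assumptions made for this sketch; they do not come from any published software.

```python
from collections import defaultdict, deque


def is_time_consistent(tree_arcs, reticulation_arcs):
    """Return True if time stamps t exist with t(u) < t(v) on every tree arc
    and t(u) == t(v) on every reticulation arc (hypothetical helper)."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Step 1: the two ends of a reticulation arc must be contemporaries,
    # so merge them into one group.
    for u, v in reticulation_arcs:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v

    # Step 2: build the graph of groups connected by tree arcs.
    successors = defaultdict(set)
    indegree = defaultdict(int)
    groups = {find(u) for u, _ in reticulation_arcs}
    for u, v in tree_arcs:
        gu, gv = find(u), find(v)
        if gu == gv:
            # An ancestor and its descendant would need the same time stamp.
            return False
        groups.update((gu, gv))
        if gv not in successors[gu]:
            successors[gu].add(gv)
            indegree[gv] += 1

    # Step 3: Kahn's algorithm; the groups can be ordered in time
    # only if the group graph has no directed cycle.
    queue = deque(g for g in groups if indegree[g] == 0)
    ordered = 0
    while queue:
        g = queue.popleft()
        ordered += 1
        for h in successors[g]:
            indegree[h] -= 1
            if indegree[h] == 0:
                queue.append(h)
    return ordered == len(groups)


# Toy network 1: hybrid h has two sibling parents a and b (contemporaries).
consistent = is_time_consistent(
    tree_arcs=[("r", "a"), ("r", "b"), ("a", "x"), ("b", "y"), ("h", "z")],
    reticulation_arcs=[("a", "h"), ("b", "h")],
)

# Toy network 2: hybrid h has parents a and c, but c is a descendant of a.
inconsistent = is_time_consistent(
    tree_arcs=[("r", "a"), ("a", "b"), ("b", "c"), ("c", "y"), ("h", "z")],
    reticulation_arcs=[("a", "h"), ("c", "h")],
)

print(consistent, inconsistent)  # True False
```

In the second toy network the hybrid's parents are an ancestor and its own descendant, so merging them produces a cycle among the groups, which is the "pseudo-cycle" referred to above.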

However, evolutionary analysis is not applied only to biological organisms. It has also been applied to the study of languages (Atkinson & Gray 2005) and to cultural objects (Collard et al. 2006). Indeed, Darwin himself recognized early on that it would be important to show that language (a characteristic solely of humans) had a natural origin and that it develops in a genealogical fashion (i.e. it has a pedigree).

Thus, both language and cultural objects have an historical component that can be studied, and both can fit into an evolutionary framework of variation + transmission + selection (Dagg 2011). Moreover, the evolutionary history also consists of both vertical and horizontal transmission. This means that the same data-analysis techniques can potentially be applied to biology, language and culture (Heggarty et al. 2010; Gray et al. 2010).

The issue that I wish to raise here is that time consistency is not a requirement of the evolution of either language or cultural objects, the way that it is for biological organisms. Organisms store the information (that is vertically and horizontally transmitted) in genes that they carry with them, which is what restricts reticulation to occurring only between contemporaries. However, language and culture store their "information" externally, either in the minds of people or in permanent or semi-permanent records (either written or pictorial). Thus, the information available for horizontal transmission can come from the distant past, as well as from the present. *

It is important to note that for language and culture the biological ideas of vertical and horizontal transmission of genetic information need modification (Cavalli-Sforza and Feldman 1981). Vertical (or descending) transmission still involves faithful copying of the information (with perhaps some losses or minor modifications). Lateral transfer, however, can be either horizontal transmission (between contemporary generations) or oblique transmission (between different generations), and it is the latter that allows time-travel of information.

Lateral transfer in this context may be a form of hybridization, in which new concepts are added from elsewhere (e.g. synonymous words), but it is more likely to be a form of horizontal gene transfer (HGT), in which concepts are simply replaced with something from elsewhere (e.g. a new word effectively replaces an old word). Recombination, in which concepts are mutually exchanged, may be rather rare.

As an illustration, Dagg (2011) provides some interesting examples of lateral transfer in the parts of mouse traps. For example, he notes that: "Torsion power may have been transmitted laterally from Egyptian torsion traps to prefabricated dead-fall traps." These traps need not be contemporaneous, because the ideas being transferred may be from pictures or descriptions of old traps rather than from concurrently existing traps. (Joachim Dagg also has a couple of blog posts where he further discusses the evolution of mouse traps: post 1 and post 2.)

As an alternative example, Johnson et al. (1989) provide an evolutionary network showing the history of the various components (mostly software, but also hardware) of the revolutionary Xerox 8010 "Star" Information System (i.e. a computer), introduced in April 1981. Note that almost all of the lateral transfer events (single arrows; mostly hybridization) are time inconsistent. To quote the authors: "Although Star was conceived as a product in 1975 and was released in 1981, many of the ideas that went into it were born in projects dating back over three decades."

Fig. 8 – How systems influenced later systems.
This graph summarizes how various systems related to Star have influenced one another over the years. Time progresses downwards. Double arrows indicate direct successors (i.e., follow-on versions). Many "influence arrows" are due to key designers changing jobs or applying concepts from their graduate research to products.

The implications of time-travelling laterally transferred information for network construction methods may be unfortunate, in the sense that evolutionary networks in biology may be quite different from those for language and culture, with the latter pair requiring somewhat different methods. At a minimum, the requirements for choosing among alternative networks will be different.

A quick look at the current literature involving network analysis of languages and cultural artifacts shows an almost universal use of unrooted graphs, most often a Neighbor-Net, Reduced-Median or Median-Joining network. Such networks cannot directly represent evolutionary history because there is no time direction in the graph. This type of analysis thus neatly side-steps the issue of representing time-travelling information in an evolutionary diagram; and it suggests that social scientists have not yet considered the consequences of the potential lack of time consistency in their data.

* Footnote: I suppose that I should be precise, and note that a modern gene bank does allow genetic information to time travel, as well.

References

Atkinson QD, Gray RD (2005) Curious parallels and curious connections — phylogenetic thinking in biology and historical linguistics. Systematic Biology 54: 513–526.

Baroni M, Semple C, Steel M (2004) A framework for representing reticulate evolution. Annals of Combinatorics 8: 391–408.

Baroni M, Semple C, Steel M (2006) Hybrids in real time. Systematic Biology 55: 46–56.

Cavalli-Sforza LL, Feldman MW (1981) Cultural Transmission and Evolution. Princeton University Press, Princeton.

Collard M, Shennan SJ, Tehrani JJ (2006) Branching, blending, and the evolution of cultural similarities and differences among human populations. Evolution and Human Behavior 27: 169–184.

Dagg JL (2011) Exploring mouse trap history. Evolution: Education and Outreach 4: 397–414.

Gray RD, Bryant D, Greenhill SJ (2010) On the shape and fabric of human history. Philosophical Transactions of the Royal Society of London series B 365: 3923–3933.

Heggarty P, Maguire W, McMahon A (2010) Splits or waves? Trees or webs? How divergence measures and network analysis can unravel language histories. Philosophical Transactions of the Royal Society of London series B 365: 3829–3843.

Johnson J, Roberts TL, Verplank W, Smith DC, Irby C, Beard M, Mackey K (1989) The Xerox "Star": a retrospective. IEEE Computer 22: 11–29.

Moret BME, Nakhleh L, Warnow T, Linder CR, Tholse A, Padolina A, Sun J, Timme R (2004) Phylogenetic networks: modeling, reconstructibility, and accuracy. IEEE/ACM Transactions on Computational Biology and Bioinformatics 1: 13–23.

Sang T, Zhong Y (2000) Testing hybridization hypotheses based on incongruent gene trees. Systematic Biology 49: 422–434.

2 comments:

  1. Nice post.
    I'd add one comment: It is important to distinguish between the true evolutionary history, which has to be time consistent, and the reconstructed one, which may have a reticulation edge from an ancestor to descendant, simply due to incomplete taxon sampling (or extinction). In other words, except for ensuring acyclicity, I don't think one needs impose time consistency constraints during inference.

    Luay Nakhleh

    Replies
    1. Thanks, Luay. You are right that it is important to make the distinction; and it is always possible to add "ghost" lineages to account for apparent time inconsistency in a reconstructed network. However, consistency can be a valuable criterion for choosing among reconstructions, as Leo van Iersel and I discussed in an earlier post (May 8, 2012).
