
How I do it: a short guide to a nice graph

At the end of the 20th and into the new 21st century, phylogenies have been largely reduced to stick graphs, often quite unappealing ones. In the papers I co-authored, I always tried to enhance the graphics, and I have often been asked how I do it. So here's my protocol for a little basic tree-and-networks magic.


These days, when we think of a phylogeny, we usually mean a one-dimensional stick graph, a phylogenetic tree much different from Haeckel's famous oak (see this GWoN post for a short introduction into tree metaphors and further links).

Evolution is, in essence, change over time. Hence, it should be standard to show (or at least document) the phylogram, that is, the tree with branch lengths reflecting the amount of change, or genetic or other distances. Unfortunately, many publications still report only cladograms, trees without branch lengths.

The figure that got me a Nature co-authorship (Friis et al. 2007, fig. 3). See also this post


A possible phylogeny of extinct and extant cycads (Grimm, 1999). The branch lengths in this spiral tree are not proportional, but reflect the amount of change.

Molecular evolution of the internal transcribed spacers in beech, exemplified by four sequence patterns (coloured version of Fig. 8 eventually published in Grimm, Denk & Hemleben 2007).
Even when I still did "cladistic" analyses (for my Diplom thesis), I tried to show my results in a more pleasant, visually appealing way. Without knowing it, I entered the realm of tree metaphors very early.

My first phylogenetic paper (Denk et al. 2002) naturally included just a standard phylogram, much in contrast to my doctoral thesis (Grimm 2003), which is quite packed with trees (straightforward or metaphorical ones) and graphics trying to visualise evolution rather than phylogeny.

I always felt phylogeneticists should not stop at inferring a tree, but a) try to understand its basis, the underlying data, b) present a processed form of the inference results, and c) show all the data has to offer. The latter point forced me into using networks, mainly splits graphs such as the distance-based neighbour-net (Bryant & Moulton 2002, 2004), planar meta-phylogenetic networks, and support consensus networks (Holland & Moulton 2003; Schliep et al. 2017) based on Bayesian-sampled topologies or bootstrap (BS) pseudoreplicates.

Eventually, I managed to publish my first non-tree graphs in 2006.
Some networks for maples (Grimm et al. 2006, figs 2, 3; open access). Left, distance-based neighbour-nets; right, a Bayesian posterior probability network ('bipartition network').

And in the following decade (till the end of my scientific career in 2016), any paper with me as co-author would include something like this (at least in the supplement).
2ISP-aware neighbour-nets with bootstrap supports mapped for selected phylogenetic splits (Potts et al. 2014, fig. 3).

You want that, too? Here's how I did it.

My standard work pipeline

Dealing with non-trivial data sets, I soon realised that Bayesian inference has little to offer. It is a method to infer a tree and assess the probability of its branches. Other algorithms also infer trees, but bootstrapping (or jack-knifing) has the potential to capture conflicting signals in the data, whereas Bayesian posterior probabilities will usually converge on one alternative. For the 2006 paper, I was brought into contact with Alexandros Stamatakis, who had just finished his doctoral thesis and programmed RAxML III, and since then RAxML (now RAxML 8) has been my prime choice to analyse my data.

Ideally, you have a Linux system (or computer cluster), so you can compile the most recent version of RAxML, which can be found on GitHub. On my stand-alone computer, when dealing with quite small data sets, I often used the Windows executables, which nice people compile from time to time (also included in the GitHub file list).

The batch file for my standard analyses of oligogene data sets (here: a data set for Drosanthemum, an analysis I did for my last paper as a professional scientist) includes the following code lines. The input was an oligogene (5-gene) matrix labelled "DrosOnly.all.epf" in (extended) PHYLIP format ("extended" means that the names have no length restrictions, unlike in the classic PHYLIP format); "1_ITS", etc. are exclusion files, and "part" is the file defining the partitions (see RAxML's manual for formatting and syntax). The installed RAxML Windows/DOS executable (it needs to be in the same folder as the batch file) is "raxmlHPC-PTHREADS-SSE3".

Step 1: Cutting the combined matrix into its component genes
raxmlHPC-PTHREADS-SSE3 -T 1 -s DrosOnly.all.epf -m GTRCAT -n xx -E 1_ITS
raxmlHPC-PTHREADS-SSE3 -T 1 -s DrosOnly.all.epf -m GTRCAT -n xx -E 2_trnSG
raxmlHPC-PTHREADS-SSE3 -T 1 -s DrosOnly.all.epf -m GTRCAT -n xx -E 3_rpl16
raxmlHPC-PTHREADS-SSE3 -T 1 -s DrosOnly.all.epf -m GTRCAT -n xx -E 4_tQr16
raxmlHPC-PTHREADS-SSE3 -T 1 -s DrosOnly.all.epf -m GTRCAT -n xx -E 5_r16tK
raxmlHPC-PTHREADS-SSE3 -T 1 -s DrosOnly.all.epf -m GTRCAT -n xx -E 6_cpAll -q part
The last code line generates a plastid-only dataset. When combining data from different genomes, e.g. nuclear and plastid, you should always run mutually exclusive tree and branch-support analyses to test for congruence: commonly used tests like the ILD test will, in most cases, not recognise incongruence and fail to identify jumping taxa, so-called "rogue taxa".
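To make the role of the exclusion files more concrete, here is a minimal sketch of how they could be generated. It assumes, following the RAxML 8 manual as far as it is understood here, that a file passed via -E lists the alignment columns to drop as whitespace-separated "start-end" ranges, and that RAxML names the reduced matrix after the input plus the exclusion file name (hence "DrosOnly.all.epf.1_ITS" in Step 2). The gene boundaries below are invented for illustration and are not those of the actual matrix; check the manual before relying on the format.

```python
# Hypothetical gene boundaries (1-based, inclusive alignment columns).
# These numbers are made up; take the real ones from your own matrix.
GENES = {
    "ITS": (1, 700),
    "trnSG": (701, 1500),
    "rpl16": (1501, 2400),
    "tQr16": (2401, 3300),
    "r16tK": (3301, 4200),
}


def exclusion_ranges(genes, keep):
    """Return the columns to exclude so that only the genes in `keep` remain."""
    dropped = [span for name, span in genes.items() if name not in keep]
    return " ".join(f"{start}-{end}" for start, end in dropped)


# One exclusion file per single-gene matrix, named as in the batch file.
exclusion_files = {
    f"{number}_{name}": exclusion_ranges(GENES, {name})
    for number, name in enumerate(GENES, start=1)
}
# Plastid-only matrix: keep everything except the nuclear ITS region.
exclusion_files["6_cpAll"] = exclusion_ranges(GENES, set(GENES) - {"ITS"})

for filename, content in exclusion_files.items():
    print(filename, "->", content)
```

Writing each value of `exclusion_files` to a text file of the same name would give the six files referenced by the -E option.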

Step 2: Inferring ML trees and establishing branch support via fast bootstrapping. The number of necessary bootstrap replicates is determined by Pattengale et al.'s (2009) extended majority-rule bootstop criterion. The -T option determines how many processors will be used for parallelisation; I have a quad-core i7, so dedicating 3 or 4 threads leaves the remaining ones free for running other programmes.

raxmlHPC-PTHREADS-SSE3 -T 3 -s DrosOnly.all.epf -m GTRCAT -n 0_all.q -f a -p 32123 -x 65489 -# autoMRE -q part
raxmlHPC-PTHREADS-SSE3 -T 3 -s DrosOnly.all.epf -m GTRCAT -n 0_all.noq -f a -p 32123 -x 65489 -# autoMRE
These two lines invoke a fully partitioned and an unpartitioned analysis. It has become fashionable to test models and, e.g., use PartitionFinder to define partitions: for oligogene datasets this is a waste of time, and may even be counterproductive. Hence, I just run a fully (and biologically/genetically meaningful) partitioned analysis (e.g. noncoding vs. coding; in the case of protein-coding genes, one or two partitions for the 1st and 2nd codon positions and another for the 3rd) and an unpartitioned one as extreme endpoints. One observation I made is that their results are usually not that different.
raxmlHPC-PTHREADS-SSE3 -T 3 -s DrosOnly.all.epf.1_ITS -m GTRCAT -n 1_ITS -f a -p 32123 -x 65489 -# autoMRE
raxmlHPC-PTHREADS-SSE3 -T 3 -s DrosOnly.all.epf.2_trnSG -m GTRCAT -n 2_trnSG -f a -p 32123 -x 65489 -# autoMRE
raxmlHPC-PTHREADS-SSE3 -T 3 -s DrosOnly.all.epf.3_rpl16 -m GTRCAT -n 3_rpl16 -f a -p 32123 -x 65489 -# autoMRE
raxmlHPC-PTHREADS-SSE3 -T 3 -s DrosOnly.all.epf.4_tQr16 -m GTRCAT -n 4_tQr16 -f a -p 32123 -x 65489 -# autoMRE
raxmlHPC-PTHREADS-SSE3 -T 3 -s DrosOnly.all.epf.5_r16tK -m GTRCAT -n 5_r16tK -f a -p 32123 -x 65489 -# autoMRE
raxmlHPC-PTHREADS-SSE3 -T 3 -s DrosOnly.all.epf.6_cpAll -m GTRCAT -n 6_cp.q -f a -p 32123 -x 65489 -# autoMRE -q part.6_cpAll
raxmlHPC-PTHREADS-SSE3 -T 3 -s DrosOnly.all.epf.6_cpAll -m GTRCAT -n 6_cp.noq -f a -p 32123 -x 65489 -# autoMRE
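Since the nine Step 2 lines differ only in input file, run name and partition file, they can also be generated rather than typed. The following sketch rebuilds exactly the command lines listed above and annotates each option; the flag descriptions follow the RAxML 8 manual, while the helper name `raxml_command` is made up for this example.

```python
RAXML = "raxmlHPC-PTHREADS-SSE3"  # executable name used in the batch file


def raxml_command(alignment, run_name, threads=3, partition_file=None):
    """Build one rapid-bootstrap plus ML-search command line (RAxML 8 syntax)."""
    parts = [
        RAXML,
        "-T", str(threads),  # number of threads
        "-s", alignment,     # input alignment
        "-m", "GTRCAT",      # substitution model
        "-n", run_name,      # suffix of all output files
        "-f", "a",           # rapid bootstrapping plus best-tree search
        "-p", "32123",       # random seed for the parsimony starting trees
        "-x", "65489",       # random seed for rapid bootstrapping
        "-#", "autoMRE",     # extended majority-rule bootstop criterion
    ]
    if partition_file:
        parts += ["-q", partition_file]  # partition definitions
    return " ".join(parts)


matrix = "DrosOnly.all.epf"
genes = ["1_ITS", "2_trnSG", "3_rpl16", "4_tQr16", "5_r16tK"]

commands = [
    raxml_command(matrix, "0_all.q", partition_file="part"),
    raxml_command(matrix, "0_all.noq"),
]
commands += [raxml_command(f"{matrix}.{gene}", gene) for gene in genes]
commands += [
    raxml_command(f"{matrix}.6_cpAll", "6_cp.q", partition_file="part.6_cpAll"),
    raxml_command(f"{matrix}.6_cpAll", "6_cp.noq"),
]

print("\n".join(commands))  # redirect into a .bat or .sh file to run it
```

Keeping the seeds and the model in one place makes it harder to run the single-gene and combined analyses with accidentally different settings.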

The result is a collection of trees (single-gene and combined) and bootstrap analyses (single-gene and combined), partitioned and unpartitioned. Since it's quickly done, and there are options to view and compare the results (such as bootstrap consensus networks, see e.g. Schliep et al. 2017), there is no excuse for not doing a full analysis these days. A full analysis should include single-gene inferences and genome-exclusive inferences to make sure the combined tree is not riddled with branching artefacts due to conflicting or partly incompatible signal. (You can find batch/shell files and the corresponding data input and RAxML output files in the supplementary data archives of my more recent papers that include phylogenetic analyses, e.g. [Osmundaceae] [Loranthaceae1] [Loranthaceae2]; more here.)
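What a bootstrap consensus network summarises is, at its core, how often each split (bipartition of the taxon set) occurs among the bootstrap replicate trees. The sketch below counts those frequencies for a handful of toy Newick trees. It is a simplified illustration, not the algorithm implemented in SplitsTree or in the software of Schliep et al. (2017), and the small parser assumes plain taxon names without quotes or Newick comments outside of branch lengths. In a real run, the replicate trees would be read from the RAxML bootstrap output file, one tree per line.

```python
import re
from collections import Counter


def clades(newick):
    """Return (all taxa, list of clades) for one Newick string."""
    text = re.sub(r":[^,();]+", "", newick.strip())  # drop branch lengths
    tokens = re.findall(r"[(),;]|[^(),;\s]+", text)
    stack, found, prev = [set()], [], None
    for tok in tokens:
        if tok == "(":
            stack.append(set())
        elif tok == ")":
            done = stack.pop()
            found.append(frozenset(done))
            stack[-1] |= done
        elif tok not in (",", ";") and prev != ")":
            stack[-1].add(tok)  # a leaf name (labels after ")" are skipped)
        prev = tok
    return frozenset(stack[0]), found


def splits(newick):
    """Non-trivial splits, each stored as the side lacking the first taxon."""
    taxa, found = clades(newick)
    reference = min(taxa)
    result = set()
    for clade in found:
        side = clade if reference not in clade else taxa - clade
        if 2 <= len(side) <= len(taxa) - 2:
            result.add(side)
    return result


def split_support(trees):
    """Percentage of trees in which each split occurs."""
    counts = Counter()
    for tree in trees:
        counts.update(splits(tree))
    return {split: 100.0 * n / len(trees) for split, n in counts.items()}


# Four toy replicates with conflicting signal (invented for illustration).
replicates = [
    "((A,B),(C,D),E);",
    "((A,B),(C,D),E);",
    "((A,C),(B,D),E);",
    "(A,B,((C,D),E));",
]
support = split_support(replicates)
for split, value in sorted(support.items(), key=lambda item: -item[1]):
    print(sorted(split), value)
```

A consensus tree would keep only the majority splits; a consensus network can also display the minority alternatives, which is where conflicting signal becomes visible.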


For the trees, I always opened and viewed the "RAxML_branchlabelled" tree (see GWoN; related posts).
When you use the filled bubbles (circles, squares), make sure that you change their outline to "invisible". Otherwise your export will have two objects per bubble, one for the filling and one for the outline, which can be annoying for further processing (outlines are easily changed within the graphics programmes).

I use EPS as export format for further manipulation in Adobe Illustrator, and SVG for CorelDraw (here the export and import still work without strange elements, in contrast to Dendroscope). Note that you have to add the proper suffix (e.g. "mynet.eps" as filename), because SplitsTree will not add it automatically. Don't use the PDF export option here: SplitsTree and Dendroscope were made by the same group, but whereas Dendroscope's PDF export includes elements, SplitsTree's is just a down-scaled image.

Screenshot of a typical CorelDraw project for a pimped-up bootstrap consensus network (this one was for Khanum et al. 2016). Basic graph structure optimised using SplitsTree, exported as SVG, and opened in CorelDraw for graphical enhancement (replacing terminal edge complexes by coloured triangles, fusing support information from different analyses; here: ML bootstrapping and Bayesian PP).

What if a tree is not enough

In case you have conflicting support patterns, a neighbour-net is always a good choice to map them.

For this you need a pairwise distance matrix, ideally in PHYLIP format, which you open in SplitsTree (there are no length restrictions regarding taxon names). SplitsTree can open NEXUS files, but the only problem-free export/import is the "simple NEXUS" option of Mesquite. I still have a copy of PAUP*, so I use it for calculating the distances. In principle, SplitsTree can calculate distances directly from a character matrix, but in the past one could find inconsistencies and small errors. This is perhaps not a problem anymore. Still, I recommend using a programme you usually rely on to calculate the pairwise distances and export them as a PHYLIP-formatted matrix for import into SplitsTree and auto-calculation of the neighbour-net (additional algorithms, for trees and networks, are implemented).
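For readers without PAUP*, the idea of a PHYLIP-formatted distance matrix can be illustrated with a few lines of Python. This sketch is a stand-in, not the workflow described above: it computes uncorrected p-distances (proportion of differing sites, skipping any position where either sequence has a gap or ambiguity code) for three invented sequences and prints a square matrix with the number of taxa on the first line. The relaxed layout (name, a space, then the values) is an assumption; the classic PHYLIP format pads names to ten characters, so check what your version of SplitsTree accepts.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected p-distance over sites where both sequences show A, C, G or T."""
    pairs = [
        (x, y)
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x in "ACGT" and y in "ACGT"
    ]
    if not pairs:
        raise ValueError("sequences share no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def phylip_matrix(alignment):
    """Format all pairwise distances as a square PHYLIP-style matrix."""
    names = list(alignment)
    lines = [str(len(names))]
    for a in names:
        row = " ".join(f"{p_distance(alignment[a], alignment[b]):.4f}" for b in names)
        lines.append(f"{a} {row}")
    return "\n".join(lines)


# Toy alignment; names and sequences are invented for illustration.
alignment = {
    "taxonA": "ACGTACGT",
    "taxonB": "ACGTACGA",
    "taxonC": "AC-TTCGA",
}
matrix_text = phylip_matrix(alignment)
print(matrix_text)
```

For real data, model-based distances from an established programme are usually preferable to raw p-distances; the point here is only the file layout.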

With the neighbour-net, the BS consensus network(s) and graphics software at hand, you just identify corresponding groups and map the BS support on the brackets or edge bundles referring to that split. And you can enrich your paper with something like this:
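The bookkeeping behind such a figure can be sketched as a simple lookup: for a group of taxa seen in the neighbour-net, collect the support that the corresponding split received in each analysis and join the values into one label. The function names, the "-" placeholder for a split that was not found, and all support values below are invented for illustration; the mapping onto the graph itself remains manual work in the graphics programme.

```python
def normalise(group, all_taxa):
    """Represent a split by the side that lacks the alphabetically first taxon."""
    group, all_taxa = frozenset(group), frozenset(all_taxa)
    return group if min(all_taxa) not in group else all_taxa - group


def support_label(group, all_taxa, tables, missing="-"):
    """Join the support a split receives in several analyses, e.g. '98/-/71'."""
    key = normalise(group, all_taxa)
    values = []
    for table in tables:
        support = table.get(key)
        values.append(missing if support is None else f"{support:.0f}")
    return "/".join(values)


# Toy support tables (split -> bootstrap percentage); all values are invented.
taxa = {"A", "B", "C", "D", "E"}
nuclear = {normalise({"A", "B"}, taxa): 98.0}
plastid = {normalise({"D", "E"}, taxa): 64.0}
combined = {
    normalise({"A", "B"}, taxa): 71.0,
    normalise({"D", "E"}, taxa): 55.0,
}

label_ab = support_label({"A", "B"}, taxa, [nuclear, plastid, combined])
label_de = support_label({"D", "E"}, taxa, [nuclear, plastid, combined])
print("A+B:", label_ab)
print("D+E:", label_de)
```

Because a split is the same whichever side is named, the lookup normalises both the query and the table keys to one side before comparing them.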


Happy drawing.


Some important links

Cited and selected literature
Bryant D, Moulton V. 2002. NeighborNet: an agglomerative method for the construction of planar phylogenetic networks. In: Guigó R, Gusfield D, eds. Algorithms in Bioinformatics, Second International Workshop, WABI. Rome, Italy: Springer Verlag, Berlin, Heidelberg, New York, p. 375-391. The original paper introducing the neighbour-net, apparently ignored by biologists.
Bryant D, Moulton V. 2004. Neighbor-Net: An agglomerative method for the construction of phylogenetic networks. Molecular Biology and Evolution 21:255-265. The re-publication in a biological journal.
Denk T, Grimm G, Stögerer K, Langer M, Hemleben V. 2002. The evolutionary history of Fagus in western Eurasia: Evidence from genes, morphology and the fossil record. Plant Systematics and Evolution 232:213-236.
Friis EM, Crane PR, Pedersen KR, Bengtson S, Donoghue PCJ, Grimm GW, Stampanoni M. 2007. Phase-contrast X-ray microtomography links Cretaceous seeds with Gnetales and Bennettitales. Nature 450:549-552.
Grimm GW. 1999. Phylogenie der Cycadales. Diploma thesis. Eberhard Karls Universität. http://www.palaeogrimm.org/themen/diplomarbeit/inhalt.htm. [English abstract]
Grimm GW, Renner SS, Stamatakis A, Hemleben V. 2006. A nuclear ribosomal DNA phylogeny of Acer inferred with maximum likelihood, splits graphs, and motif analyses of 606 sequences. Evolutionary Bioinformatics 2:279–294.
Holland B, Moulton V. 2003. Consensus networks: A method for visualising incompatibilities in collections of trees. In: Benson G, Page R, eds. Algorithms in Bioinformatics: Third International Workshop, WABI, Budapest, Hungary, Proceedings. Berlin, Heidelberg, Stuttgart: Springer Verlag, p. 165–176. This is the publication to cite when using the consensus network approach implemented in SplitsTree.
Huson DH, Bryant D. 2006. Application of phylogenetic networks in evolutionary studies. Molecular Biology and Evolution 23:254–267. The reference to cite for SplitsTree4.
Khanum R, Surveswaran S, Meve U, Liede-Schumann S. 2016. Cynanchum (Apocynaceae: Asclepiadoideae): A pantropical Asclepiadoid genus revisited. Taxon 65:467–486.
Schliep K, Potts AJ, Morrison DA, Grimm GW. 2017. Intertwining phylogenetic trees and networks. Methods in Ecology and Evolution DOI:10.1111/2041-210X.12760.
Pattengale ND, Alipour M, Bininda-Emonds ORP, Moret BME, Stamatakis A. 2009. How many bootstrap replicates are necessary? In: Batzoglou S, ed. RECOMB 2009. Berlin, Heidelberg: Springer-Verlag, p. 184–200. The reference to cite for the bootstop criterion.
Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30:1312–1313. The reference to cite for RAxML 8.
Stamatakis A. 2015. Using RAxML to infer phylogenies. Current Protocols in Bioinformatics 51:6.14.1–6.14.14. doi: 10.1002/0471250953.bi0614s51. Useful for beginners and experienced tree-inferrers.
