Study Provides Framework for 1 Billion Years of Green Plant Evolution

10/22/2019 Alan Flurry (aflurry@uga.edu), U. Georgia; Katie Willis (kewillis@ualberta.ca), U. Alberta

Prof. Tandy Warnow helps an international consortium construct a large evolutionary tree of green plants, using advanced algorithms.


Green alga Lacunastrum gracillimum, female cones of gymnosperm, Gnetum gnemon, and cherry tree flower, Prunus domestica. Photo credits: Michael Melkonian and Walter S. Judd.

Gene sequences for more than 1100 plant species have been released by an international consortium of nearly 200 plant scientists, the culmination of a nine-year research project.

The One Thousand Plant Transcriptomes Initiative (1KP) is a global collaboration to examine the diversification of plant species, genes and genomes across the more than one-billion-year history of green plants dating back to the ancestors of flowering plants and green algae.

“In the tree of life, everything is interrelated,” said Gane Ka-Shu Wong, lead investigator and professor in the University of Alberta’s Faculty of Science and Faculty of Medicine & Dentistry. “And if we want to understand how the tree of life works, we need to examine the relationships between species. That’s where genetic sequencing comes in.”

The findings, published today in Nature, reveal the timing of whole genome duplications and the origins, expansions and contractions of gene families contributing to fundamental genetic innovations enabling the evolution of green algae, mosses, ferns, conifer trees, flowering plants and all other green plant lineages. The history of how and when plants secured the ability to grow tall and make seeds, flowers and fruits provides a framework for understanding plant diversity around the planet, including annual crops and long-lived forest tree species.

“Our inferred relationships among living plant species inform us that over the billion years since an ancestral green algal species split into two separate evolutionary lineages, one including flowering plants, land plants and related algal groups and the other comprising a diverse array of green algae, plant evolution has been punctuated with innovations and periods of rapid diversification,” said James Leebens-Mack, professor of plant biology in the University of Georgia Franklin College of Arts and Sciences and co-corresponding author on the study. “In order to link what we know about gene and genome evolution to a growing understanding of gene function in flowering plant, moss and algal organisms, we needed to generate new data to better reflect gene diversity among all green plant lineages.”

The study inspired a community effort to gather and sequence diverse plant lineages derived from terrestrial and aquatic habitats on a global scale. Over 100 taxonomic specialists contributed material from field and living collections that include the Central Collection of Algal Cultures; Royal Botanic Gardens, Kew; Royal Botanic Garden Edinburgh; Atlanta Botanical Garden; New York Botanical Garden; Fairylake Botanical Garden, Shenzhen; the Florida Museum of Natural History; Duke University; University of British Columbia Botanical Garden; and the University of Alberta. By sequencing and analyzing genes from a broad sampling of plant species, researchers are better able to reconstruct gene content in the ancestors of all crops and model plant species, and gain a more complete picture of the gene and genome duplications that enabled evolutionary innovations.

Nearly a decade ago, Wong organized private funding through the Somekh Family Foundation as well as support from the Government of Alberta and a sequencing commitment from BGI in Shenzhen, China, to launch 1KP. Once the project was operational, additional resources came from other ongoing projects, including iPlant (now CyVerse) funded by the U.S. National Science Foundation.

The massive scope of the project demanded development and refinement of new computational tools for sequence assembly and phylogenetic analysis.

“New algorithms were developed by software engineers at BGI to assemble the massive volume of gene sequence data generated for this project,” explained Wong.

Founder Professor of Computer Science Tandy Warnow, of the University of Illinois at Urbana-Champaign, and Siavash Mirarab, assistant professor of electrical and computer engineering at the University of California San Diego, developed new algorithms for inferring evolutionary relationships from hundreds of gene sequences for over one thousand species, addressing substantial heterogeneity in evolutionary histories across the genomes.

The timing of 244 whole genome duplications across the green plant tree of life was one of the interrelated research foci of the project.

“Perhaps the biggest surprise of our analyses was the near absence of whole genome duplications in the algae,” said Mike Barker, associate professor of ecology and evolutionary biology at the University of Arizona. “Building on nearly 20 years of research on plant genomes, we found that the average flowering plant genome has nearly 4 rounds of ancestral genome duplication dating as far back as the common ancestor of all seed plants more than 300 million years ago. We also find multiple rounds of genome duplication in fern lineages, but there is little evidence of genome doubling in algal lineages.”

In addition to genome duplications, the expansion of key gene families has contributed to the evolution of multicellularity and complexity in green plants.

“Gene family expansions through duplication events catalyzed diversification of plant form and function across the green tree of life,” said co-author Marcel Quint, professor of crop physiology at Halle University, Germany. “Such expansions unleashed during terrestrialization or even before set the stage for evolutionary innovations including the origin of the seed and later the origin of the flower.”

“The view of evolutionary relationships provided by 1KP has led to new hypotheses about the origins of key structures and processes in green plants,” said co-author Pam Soltis, of the Florida Museum of Natural History, University of Florida.


The paper, “One Thousand Plant Transcriptomes and Phylogenomics of Green Plants,” was published in Nature (doi: 10.1038/s41586-019-1693-2). Sequences, sequence alignments and tree data are available through the CyVerse Data Commons.


Tandy Warnow

Warnow Helps International Consortium Construct a Large Evolutionary Tree of Green Plants, Using Advanced Algorithms

Species tree estimation that can address genome-scale data is challenging because of heterogeneity across the genome, and it is particularly challenging for large datasets, such as the 1KP dataset with more than 1000 species. 

To enable highly accurate species trees for the 1KP datasets, Illinois Computer Science Professor Tandy Warnow, in collaboration with her former student Siavash Mirarab (now an assistant professor at UCSD), developed the ASTRAL-II method, which is now one of the leading methods for species tree estimation worldwide.

“Working with the 1KP consortium over the last many years has been one of the most stimulating experiences in my research life. Not only did it lead to ASTRAL (and its improved version for this paper in Nature), but also methods for large-scale multiple sequence alignment (e.g., PASTA and UPP) that are also in wide use,” said Warnow.

“My continued involvement in the 1KP consortium has led me to develop new approaches to scale methods to large datasets, such as TreeMerge and Guide Tree Merger (with my PhD students Erin Molloy and Vladimir Smirnov), which will be used in the next analyses we perform with the 1KP consortium. Finally, all of this work has benefited from the use of the Blue Waters supercomputer at NCSA, and from the generous support of The Grainger Foundation.”



This story was published October 22, 2019.