Dr. Kevin Liu has just received notification of an NSF CAREER award

CAREER: Future phylogenies: novel computational frameworks for biomolecular sequence analysis involving complex evolutionary origins


Phylogenetics is the discipline that seeks to reconstruct and analyze the phylogeny, or evolutionary history, of a set of organisms. Phylogenetic reconstruction is primarily accomplished through computational analysis of DNA and other biomolecular sequence data. Phylogenies and the evolutionary insights that they provide are essential to biology and other disciplines, as well as to many applications. Important examples include reconstructing and studying the Tree of Life (the evolutionary history of all life on Earth), understanding human origins, infectious disease epidemiology and the discovery of new solutions to future pandemics, crop improvement and agriculture, and forensic science. One of the two key ingredients needed for phylogenetic studies has seen a major leap forward thanks to advances in biomolecular sequencing technology: the scale of available biomolecular data is now among the largest in any domain and, in 2025, biomolecular data velocity and storage are projected to be comparable to or larger than those of Twitter and YouTube. On the other hand, recent "big data" phylogenetic studies point to a critical gap regarding the second of the two key ingredients in phylogenetics: existing computational algorithms need to move beyond their traditional simplifying assumptions about biomolecular sequence evolution. Two of the most important assumptions are: (1) the "sequence-unaware" assumption, under which methods ignore the inherently sequential nature of biomolecular sequences, and (2) the a priori assumption that evolutionary relationships have a simple branching structure and are "tree-like", i.e., can be accurately described by a tree or other simple representation. New computational approaches and infrastructure are needed to move beyond these traditional assumptions and unlock the study of "future phylogenies" and next-generation phylogenetics.

This project will therefore create new pathbreaking models and algorithms for complex phylogenetic analyses of biomolecular sequence data. The project also addresses gaps in STEM education through new curriculum development and a collaboration with the Impression 5 Science Center, a children’s science museum in mid-Michigan. Project impacts will be broadened through open-source software distributions and open data resources, new scientific discoveries enabled by the developed software and data infrastructure, scientific outreach activities, and student training and mentoring with a strong emphasis on diversity, equity, and inclusion (DEI).

(Date Posted: 2022-02-18)