UnJin Lee

National Science Foundation Postdoctoral Fellow in Biology

Rockefeller University

PhD, MA, University of Chicago, 2021

Committee on Genetics, Genomics, and Systems Biology

Publications

The 3-Dimensional Genome Drives the Evolution of Asymmetric Gene Duplicates via Enhancer Capture-Divergence

Previous evolutionary models of duplicate gene evolution have overlooked the pivotal role of genome architecture. Here, we show that proximity-based regulatory recruitment of distally duplicated genes (enhancer capture) is an efficient mechanism for modulating tissue-specific production of pre-existing proteins.

Comparative Single Cell Analysis of Transcriptional Bursting Reveals the Role of Genome Organization on de novo Transcript Origination

Spermatogenesis is a key developmental process underlying the origination of newly evolved genes. However, rapid cell type-specific transcriptomic divergence of the Drosophila germline has posed a significant technical barrier for comparative single-cell RNA-sequencing (scRNA-Seq) studies. By quantifying a surprisingly strong correlation between species- and cell type-specific divergence in three closely related Drosophila species, we apply a simple statistical procedure to identify a core set of 198 genes that are highly predictive of cell type identity while remaining robust to species-specific differences that span 25-30 million years of evolution.

A Synergistic, Cultivator Model of De Novo Gene Origination

The origin and fixation of evolutionarily young genes is a fundamental question in evolutionary biology. However, understanding the origins of newly evolved genes arising de novo from noncoding genomic sequences is challenging. This is partly due to the low likelihood that several neutral or nearly neutral mutations fix prior to the appearance of an important novel molecular function. This issue is particularly exacerbated in large effective population sizes where the effect of drift is small. To address this problem, we propose a regulation-focused, cultivator model for de novo gene evolution.

Evolution and maintenance of phenotypic plasticity

We introduce a novel framework for exploring the evolutionary consequences of phenotypic plasticity (adaptive and non-adaptive) that integrates both genic and epigenetic effects on phenotype via stochastic differential equations and in silico selection.

Topological evolution of coexpression networks by new gene integration maintains the hierarchical and modular structures in human ancestors

We analyze the global structure and evolution of human gene coexpression networks driven by new gene integration. When genes are linked at a Pearson correlation coefficient greater than or equal to 0.5, we find that the coexpression network consists of 334 small components and one "giant" connected subnet comprising 6317 interacting genes. This network shows a power-law degree distribution and small-world properties. The average clustering coefficient of younger genes is larger than that of older genes (0.6685 vs. 0.5762). In particular, we find that the younger genes with a larger degree also show a hierarchical architecture. The younger genes play an important role in the overall pivotability of the network, and this network contains few redundant duplicate genes. Moreover, we find that gene duplication and orphan genes are two dominant evolutionary forces in shaping this network. Both the duplicate genes and orphan genes develop new links through a "rich-gets-richer" mechanism. With the gradual integration of new genes into the ancestral network, most of the topological structure features of the network gradually increase. However, the exponent of the degree distribution and the modularity coefficient of the whole network do not change significantly, which implies that the evolution of coexpression networks maintains the hierarchical and modular structures in human ancestors.
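
The thresholding step described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the gene names, expression values, and helper functions below are hypothetical and chosen only to show how a Pearson cutoff of 0.5 yields connected components and per-gene clustering coefficients.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def coexpression_network(expr, threshold=0.5):
    """Link two genes when their Pearson correlation is >= threshold."""
    adj = {g: set() for g in expr}
    for g1, g2 in combinations(expr, 2):
        if pearson(expr[g1], expr[g2]) >= threshold:
            adj[g1].add(g2)
            adj[g2].add(g1)
    return adj


def connected_components(adj):
    """Return connected components as a list of sets, largest first."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        comps.append(comp)
    return sorted(comps, key=len, reverse=True)


def clustering_coefficient(adj, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


# Hypothetical toy expression profiles (gene -> expression across samples).
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [1.5, 2.5, 3.4, 4.6],
    "geneD": [4.0, 1.0, 3.0, 2.0],
    "geneE": [2.0, 3.0, 1.0, 4.0],
}
network = coexpression_network(toy_expr)
components = connected_components(network)
```

In this toy data, the first three genes are strongly correlated and form the largest component, while the remaining two fall below the cutoff and stay isolated. A genome-scale analysis would likely use a vectorized correlation matrix rather than the pairwise loop shown here.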

Genomic analyses of new genes and their phenotypic effects reveal rapid evolution of essential functions in Drosophila development

It is a conventionally held dogma that the genetic basis underlying development is conserved over long evolutionary time scales. Ample experiments based on mutational, biochemical, functional, and complementary knockdown/knockout approaches have revealed the unexpectedly important role of recently evolved new genes in the development of Drosophila. The recent progress in the genome-wide experimental testing of gene effects and improvements in the computational identification of new genes (< 40 million years ago, Mya) open the door to investigating the evolution of gene essentiality at high phylogenetic resolution.

Prognostic and predictive breast cancer signature

Embodiments of the invention are directed to methods of determining the prognosis of a breast cancer patient by evaluating a specified set of genes. Specifically, methods may comprise calculating a prognosis score based on a particular algorithm. Also disclosed are compositions, kits, and methods for treating cancer in a subject in need thereof, involving one or more upstream activators and/or downstream effectors of TET1.

The Cyanobacterial Circadian Clock Follows Midday in Vivo and in Vitro

Here we report experimental platforms for driving the cyanobacterial circadian clock both in vivo and in vitro. We find that the phase of the circadian rhythm follows a simple scaling law in light-dark cycles, tracking midday across conditions with variable day length. The core biochemical oscillator composed of the Kai proteins behaves similarly when driven by metabolic pulses in vitro, indicating that such dynamics are intrinsic to these proteins. We develop a general mathematical framework based on instantaneous transformation of the clock cycle by external cues, and it successfully predicts clock behavior under many cycling environments.

Geometric Structure and Geodesic in a Solvable Model of Nonequilibrium Process

We investigate the geometric structure of a nonequilibrium process and its geodesic solutions. By employing an exactly solvable model of a driven dissipative system (generalized nonautonomous Ornstein-Uhlenbeck process), we compute the time-dependent probability density functions (PDFs) and investigate the evolution of this system in a statistical metric space where the distance between two points (the so-called information length) quantifies the change in information along a trajectory of the PDFs.

Noise-Driven Phenotypic Heterogeneity with Finite Correlation Time in Clonal Populations

There has been increasing awareness in the wider biological community that clonal phenotypic heterogeneity plays key roles in phenomena such as cellular bet-hedging and decision making, as in the case of the phage λ lysis/lysogeny and B. subtilis competence/vegetative pathways. Here, we report on the effect of stochasticity in growth rate and cellular memory/intermittency, and on its relation to phenotypic heterogeneity.

A Prognostic Gene Signature for Metastasis-Free Survival of Triple Negative Breast Cancer Patients

The application of gene expression array technology to breast cancer has emphasized the heterogeneity of this disease and also provided new tools to classify breast cancers into subtypes based on gene expression patterns. Ideally each subtype would reflect distinct molecular characteristics corresponding to discrete cancer phenotypes. This information could be used to gain prognostic insight and, eventually, to predict response to therapy.

Achievements

Awards & Honors
  • NSF Postdoctoral Research Fellowship in Biology
  • SMBE Satellite Meeting on De Novo Gene Birth Best Presentation Award
  • SMBE Satellite Meeting on De Novo Gene Birth Travel Award
  • Ridgeway Endowment Support Award
  • Hinds Evolutionary Biology Graduate Student Research Award
  • NIH T32 Gene Regulation Training Grant Recipient
  • NSF GRFP Honorable Mention 2016
  • Odyssey Scholar
  • National Merit Scholar
Academic Service Activities
  • Rockefeller University Sustainability Committee Representative
  • Rockefeller University PDA Board Member
  • Rockefeller Inclusive Science Initiative Executive Board Member/Mental Health and Accessibility Coordinator
  • RockEDU SSRP Mentor
  • Genspace Bioinformatics Workshop Instructor
  • Genspace FIT Scholar Program Mentor
  • BSD Equipment Library Founder and Committee Member
  • BSD Travel Award Committee Chair
  • Co-Host Groks Science Show Podcast
  • Co-Host Groks Science Show Radio Hour WHPK 88.5FM
  • Genetics Science Connections at the Museum of Science and Industry
  • American Association for the Advancement of Science Member
Experimental Techniques, Expertise
  • Single Cell RNA Sequencing
  • Stochastic Differential Equations
  • Applied Numerical Optimization
  • Monte Carlo Methods
  • Gene Expression Analysis
  • Survival Analysis
  • Statistical Modeling
  • Chromosome Conformation Capture (4C)
  • Enhancer-Reporter Assays
  • Phylogenetic Analysis
  • Cloning
Presentations
  • 2024 New York Area Population Genetics Meeting (Selected Talk)
  • 2023 SMBE Satellite Meeting on De Novo Genes (Poster)
  • 2022 New York Area Population Genetics Meeting (Poster)
  • 2019 Midwest Population Genetics Meeting (Selected Talk)
  • 2019 Gordon Conference on Ecological and Evolutionary Genetics (Poster)
  • 2016 Aspen Center for Physics Winter Conference on Evolution, Populations, and Physics (Poster)
  • 2013 Undergraduate Symposium of the American Association for Cancer Research (Poster)
  • 2013 University of Chicago Undergraduate Research Symposium (Poster)
Software
Scientific Software
  • sigsquared (cf. Lee and Frankenberger et al. 2013; R/S4)
  • Deterministic Runge Kutta Solvers (1st-4th order/C++)
  • Stochastic Runge Kutta Solvers (Honeycutt with Finite Correlation Time/C++)
  • Gillespie Stochastic Simulation Algorithm (First Reaction method/C++)
  • Genetic Algorithm (C++)
  • Downhill Simplex using LASSO (Java)
Languages
  • R (S3/S4)
  • MATLAB
  • C++
  • Java
  • bash