Buu Truong

Hi, I am Buu Truong. I am currently a PhD student in the Alkes Price lab at the Harvard T.H. Chan School of Public Health and a computational biologist in the Pradeep Natarajan lab at the Broad Institute of MIT and Harvard. Previously, I earned my Doctor of Medicine (MD) degree in Vietnam.

Statistical Genetics enthusiast

About me

My primary research focuses on developing and applying statistical methods to explore genetic architecture and to analyze genomic data, including genome-wide association studies (GWAS), polygenic risk scores (PRS), gene-by-environment analysis, RNA-seq, and single-cell RNA-seq. I am also interested in machine learning algorithms and causal inference methods for multi-omics data, such as Bayesian network construction, potential outcome models, and matching algorithms.

Education

Harvard University – PhD student
• Advisors: Prof. Alkes L. Price and Prof. Liming Liang
• Program in Genetic Epidemiology and Statistical Genetics

University of South Australia – Visiting Student / Researcher
• Advisors: Associate Professor S. Hong Lee and Associate Professor Thuc Duy Le
• Computational Biology and Statistical Genetics

Pham Ngoc Thach University of Medicine
• Doctor of Medicine (MD) - General Practitioner

Le Hong Phong High School for the Gifted
• Class of Computer Science

Professional Experience

Harvard University
PhD student - Mar, 2023 – Sep, 2028
Advisors: Prof. Alkes L. Price and Prof. Liming Liang
• Developed a method that integrates polygenicity enrichment from GWAS with chromatin accessibility signals from scATAC-seq data to identify disease-relevant cell types.
• Built a Mendelian randomization causal network to investigate inflammatory interactions using proteomic data.
• Re-evaluated the tagging power of HapMap3 variants and developed a new method to identify tagging SNPs using large-scale whole-genome sequencing data and linkage disequilibrium (LD) scores.
• Built a pipeline to process large scATAC-seq data with 1.2 million cells and 1.3 million peaks.

Broad Institute of MIT and Harvard – Program in Medical and Population Genetics (MPG)
Computational Biologist - Jan, 2022 – Present
Advisor: Dr. Pradeep Natarajan
• Investigated the utility of PRS and its interaction with clinical risk factors for incident coronary artery disease.
• Improved polygenic risk scores for non-European ancestries in the All of Us and Genes and Health data by combining trait-specific and cross-trait PRS, with further improvement from integrating functional annotations.
• Explored novel loci for hypertensive disorders of pregnancy using a GWAS meta-analysis across 12 biobanks.
• Performed colocalization analysis between GWAS of hypertensive disorders of pregnancy and cardiovascular-related traits.
• Investigated differences in metabolomic profiles between cardiovascular disease with and without preexisting type 2 diabetes in the UK Biobank.
• Investigated the association between clonal hematopoiesis, identified from whole-exome sequencing data, and incident lung cancer in the Mass General Brigham Biobank.

South Australian Health and Medical Research Institute
Student researcher in Statistical Genetics - Aug, 2018 – Mar, 2023
Advisor: Associate Professor S. Hong Lee
• Developed a method, motivated by gene-by-environment interaction analysis, to investigate genetic causality between complex traits; it relaxes the strict assumptions of Mendelian randomization and helps address reverse causation.
• Simulated phenotypes from UK Biobank genotype data, given heritability and the covariance structure between traits, for gene-by-environment interaction analysis.
• Developed a novel method to estimate the genetic correlation between diverse ancestries using individual-level genotypes.
• Used relatedness information to improve PRS accuracy, using a sample of close relatives 44-fold smaller than the corresponding sample of unrelated individuals. This work was published in Nature Communications (https://doi.org/10.1038/s41467-020-16829-x).

University of South Australia
Student researcher in Computational Biology - Jan, 2017 – Dec, 2021
Advisor: Associate Professor Thuc Duy Le
• Used causal inference approaches to identify gene signatures for breast cancer subtypes and to investigate treatment response from bulk and single-cell RNA-seq data.
• Built Bayesian networks from RNA-seq data to understand the causal regulatory mechanisms of microRNA and mRNA in breast cancer.
• Built algorithms to select genes that predict cell position in the Drosophila embryo using single-cell RNA-seq.

Vingroup Big Data Institute
Research Scientist in Statistical Genetics - Mar, 2021 – Dec, 2021
Advisors: Dr. Quan Nguyen (University of Queensland) and Dr. Nam Vo (Vingroup Big Data Institute).
• Simulated phenotypes for different ancestries given genotypic data, heritability, and genetic covariance.
• Built PRS with different methods for complex traits and diseases in the Vietnamese population.

Allelica
Visiting Scientist in Statistical Genetics - Sep, 2021 – Dec, 2021
Founders: Giordano Bottà, Paolo Di Domenico and George Busby.
• Implemented fine-mapping to identify causal variants for type 2 diabetes.
• Improved PRS accuracy for type 2 diabetes by leveraging functionally informed fine-mapping.

MTI Technology
Lead Data Scientist - Jan, 2020 – Nov, 2021
Project: Predicting the pregnancy success rate of in vitro fertilization.
• Used machine learning methods to estimate the success rate of in vitro fertilization from electronic health record data.

Oxford University Clinical Research Unit (OUCRU) - Mathematical Modelling
Intern - Jun, 2016 – Jun, 2017
Advisor: Assistant Professor Hannah Clapham.
• Investigated the epidemiology of hepatitis B virus (HBV) to understand its transmission.
• Applied a differential-equation SIR model of HBV spread to estimate the optimal impact of vaccination coverage.

SELECTED MANUSCRIPTS

* Equal contribution

  1. Michael C. Honigberg, Buu Truong*, Satoshi Koyama, Aniruddh P. Patel, So Mi Jemma Cho, Mark Trinder, Sarah M. Urbut, Snehal Patil, Sebastian Zöllner, Thi Ha Vy, Girish Nadkarni, Ron Do, Triin Laisk, Estonian Biobank Research Team, David van Heel, Pradeep Natarajan. “Polygenic prediction of preeclampsia and gestational hypertension”. Nature Medicine. doi: 10.1038/s41591-023-02374-9 (2023)
  2. Buu Truong, Leland Hull, Yunfeng Ruan, Qinqin Huang, Hilary Martin, David van Heel, Alicia R. Martin, S. Hong Lee, Pradeep Natarajan. “Improving polygenic risk scores by integrating trait-specific and cross-trait scores” (accepted at Cell Genomics) (2024)
  3. Buu Truong, Yunfeng Ruan, Sara Haidermota, Aniruddh Patel, Ida Surakka, Whitney Hornsby, Satoshi Koyama, S. Hong Lee, Pradeep Natarajan. 2022. “Modification of coronary artery disease clinical risk factors by coronary artery disease polygenic risk score”. (in press at Med). (2024)
  4. Art Schuermans, Buu Truong, Maddalena Ardissino, Pradeep Natarajan, Michael C. Honigberg. 2024. Genetic Associations of Circulating Cardiovascular Proteins With Gestational Hypertension and Preeclampsia. JAMA Cardiology. (doi:10.1001/jamacardio.2023.4994) (2024)
  5. Buu Truong, Xuan Zhou, Jisu Shin, Jiuyong Li, Julius H. J. van der Werf, Thuc Duy Le, and S. Hong Lee. 2020. “Efficient Polygenic Risk Scores For Biobank Scale Data By Exploiting Phenotypes From Inferred Relatives”. Nature Communications 11 (1). doi:10.1038/s41467-020-16829-x. (2020)
  6. Ruiyi Tian, R., Wiley B., Buu Truong*, …, Pradeep Natarajan, Kelly L. Bolton, Yin Cao. 2022. “Clonal Hematopoiesis and Risk of Incident Lung Cancer”. Journal of Clinical Oncology DOI: 10.1200/JCO.22.00857. (2022)
  7. Momin Md. Moksedul, Jisu Shin, Soohyun Lee, Buu Truong, Beben Benyamin, and S. Hong Lee. 2023. “A Novel Method For An Unbiased Estimate Of Cross-Ancestry Genetic Correlation Using Individual-Level Data”. Nature Communications. https://doi.org/10.1038/s41467-023-36281-x. (2023)
  8. Pham Duy, Buu Truong, Khai Tran, Guiyan Ni, Dat Nguyen, Trang T H Tran, Mai H Tran, Duong Nguyen Thuy, Nam S Vo, and Quan Nguyen. 2022. “Assessing Polygenic Risk Score Models For Applications In Populations With Under-Represented Genomics Data: An Example Of Vietnam”. Briefings In Bioinformatics. doi:10.1093/bib/bbac459. (2022)
  9. Vu V H Pham, Xiaomei Li, Buu Truong*, Thin Nguyen, Lin Liu, Jiuyong Li, and Thuc D Le. 2021. “The Winning Methods For Predicting Cellular Position In The DREAM Single-Cell Transcriptomics Challenge”. Briefings In Bioinformatics 22 (3). doi:10.1093/bib/bbaa181. (2021)
  10. Jovan Tanevski, Thin Nguyen, Buu Truong*, Nikos Karaiskos, Mehmet Eren Ahsen, Xinyu Zhang, and Chang Shu et al. 2020. “Gene Selection For Optimal Prediction Of Cell Position In Tissues From Single-Cell Transcriptomics Data”. Life Science Alliance 3 (11): e202000867. doi:10.26508/lsa.202000867. (2020)

MANUSCRIPTS

* Equal contribution; + Correspondence

  1. Samuel Kim+, Buu Truong+, Karthik Jagadeesh, Kushal Dey, Amber Shen, Soumya Raychaudhuri, Manolis Kellis, Alkes L. Price+ (2024). Leveraging single-cell ATAC-seq and RNA-seq to identify disease-critical fetal and adult brain cell types. Nat Commun 15, 563 (2024). https://doi.org/10.1038/s41467-024-44742-0 (2024)
  2. Aeron Small, Giorgio Melloni, Frederick Kamanu, Brian Bergmark, … Stephen Wiviott, Buu Truong, …, Pradeep Natarajan, Nicholas Marston. Integration of a Novel Polygenic Risk Score with Established Clinical Risk Factors for Risk Prediction of Aortic Stenosis. (in press at JAMA Cardiology) (2024)
  3. Nina Kathiresan, So Mi Jemma Cho, Romit Bhattacharya, Buu Truong, Whitney Hornsby, Pradeep Natarajan. (2023). “Representation of Race and Ethnicity in a Contemporary US Health Cohort: The All of Us Research Program”. JAMA Cardiology. 2023;8(9):859-864. doi:10.1001/jamacardio.2023.2411 (2023)
  4. Qi Yan, Nathan Blue, Buu Truong, Yu Zhang, Rafael Guerrero, Nianjun Liu, Michael C Honigberg, Samuel Parry, Rebecca B McNeil, Brian M Mercer, William A Grobman, Robert Silver, Uma M Reddy, Wapner J Ronald, David M Haas. (2023). “Genetic Associations with Dynamic Placental Proteins Identify Causal Biomarkers for Hypertension in Pregnancy”. (under review at Nature Communications). https://doi.org/10.1101/2023.05.25.23290460 (2023)
  5. Sarah M Urbut, Satoshi Koyama, Whitney Hornsby, Rohan Bhukar, Sumeet Kheterpal, Buu Truong, Margaret S Selvaraj, Benjamin Neale, Christopher J O’Donnell, Gina M Peloso, Pradeep Natarajan (2023). “Bayesian Multivariate Genetic Analysis Improves Translational Insights”. iScience. https://doi.org/10.1016/j.isci.2023.107854 (2023)
  6. Mesbah Uddin, Zhi Yu, Martin Jinye Zhang, Buu Truong, …, Pradeep Natarajan. (2022). “Germline genomic and phenomic landscape of clonal hematopoiesis in 648,992 individuals”. (under review in Nature Genetics) (2022)
  7. Neshat, Mehdi, Soohyun Lee, Md. Moksedul Momin, Buu Truong, Julius H. J. van der Werf, and S. Hong Lee. 2022. “A Novel Hyper-Parameter Can Increase The Prediction Accuracy In A Single-Step Genetic Evaluation”. Frontiers in Genetics. doi:10.1101/2022.07.03.498620. (2022)
  8. Vu Viet Hoang Pham, Muktar Ahmed, Xuan Zhou, Md Moksedul Momin, Soohyun Lee, Buu Truong, Thuc Le and Sang Hong Lee. “The effects of genome and metabolome on complex traits and phenotypic prediction” (under review in Scientific Reports). (2022)
  9. Xiaomei Li, Buu Truong, Taosheng Xu, Lin Liu, Jiuyong Li, and Thuc D. Le. 2021. “Uncovering The Roles Of Micrornas/Lncrnas In Characterising Breast Cancer Subtypes And Prognosis”. BMC Bioinformatics 22 (1). doi:10.1186/s12859-021-04215-3. (2021)
  10. Adi Tarca, Bálint Ármin Pataki, Roberto Romero, Marina Sirota, Yuanfang Guan, Rintu Kutum, …, Buu Truong, and Nardhy Gomez-Lopez et al. 2021. “Crowdsourcing Assessment Of Maternal Blood Multi-Omics For Predicting Gestational Age And Preterm Birth”. Cell Reports Medicine 2 (6): 100323. doi:10.1016/j.xcrm.2021.100323. (2021)
  11. Thin Nguyen, Samuel C. Lee, Thomas P. Quinn, Buu Truong, Xiaomei Li, Truyen Tran, Svetha Venkatesh, and Thuc Duy Le. 2021. “PAN: Personalized Annotation-Based Networks For The Prediction Of Breast Cancer Relapse”. IEEE/ACM Transactions On Computational Biology And Bioinformatics 18 (6): 2841-2847. doi:10.1109/tcbb.2021.3076422. (2021)
  12. Junpeng Zhang, Thin Nguyen, Buu Truong, Lin Liu, Jiuyong Li, and Thuc Duy Le. 2020. “Computational Methods For Predicting Autism Spectrum Disorder From Gene Expression Data”. Advanced Data Mining And Applications, 395-409. doi:10.1007/978-3-030-65390-3_31. (2020)
  13. Vu VH Pham, Junpeng Zhang, Lin Liu, Buu Truong, Taosheng Xu, Trung T. Nguyen, Jiuyong Li, and Thuc D. Le. 2019. “Identifying Mirna-Mrna Regulatory Relationships In Breast Cancer With Invariant Causal Prediction”. BMC Bioinformatics 20 (1). doi:10.1186/s12859-019-2668-x. (2019)
  14. Junpeng Zhang, Vu Viet Hoang Pham, Lin Liu, Taosheng Xu, Buu Truong, Jiuyong Li, Nini Rao, and Thuc Duy Le. 2019. “Identifying Mirna Synergism Using Multiple-Intervention Causal Inference”. BMC Bioinformatics 20 (S23). doi:10.1186/s12859-019-3215-5. (2019)
  15. Thuc Duy Le, Junpeng Zhang, Lin Liu, Buu Truong, Shu Hu, Taosheng Xu and Jiuyong Li. (2017). “Identifying microRNA targets in epithelial-mesenchymal transition using joint-intervention causal inference”. Proceedings of the 8th International Conference on Computational Systems-Biology and Bioinformatics - CSBio ’17. https://doi.org/10.1145/3156346.3156353. (2017)

CONFERENCES

* Equal contribution. +++ Platform talk.

  1. Buu Truong, Zihan Sun, Zihan Wang, Vishal Sarsani, Xikun Han, Jie Hu, JoAnn E. Manson, Frank B. Hu, Liming Liang, Jun Li. The genetic architecture of inflammatory markers characterizing obesity- and adiposity-induced inflammation. American Society of Human Genetics. (2023)
  2. Michael C. Honigberg, Buu Truong*, Satoshi Koyama, Aniruddh P. Patel, So Mi Jemma Cho, Mark Trinder, Sarah M. Urbut, Snehal Patil, Sebastian Zöllner, Thi Ha Vy, Girish Nadkarni, Ron Do, Triin Laisk, Estonian Biobank Research Team, David van Heel, Pradeep Natarajan. “Genome-wide meta-analysis identifies novel risk variants and enables polygenic prediction of preeclampsia and gestational hypertension”. American Society of Human Genetics+++. (2022)
  3. Buu Truong, Yunfeng Ruan, Sara Haidermota, Aniruddh Patel, Ida Surakka, Whitney Hornsby, Satoshi Koyama, S. Hong Lee, Pradeep Natarajan. “Modification of coronary artery disease clinical risk factors by coronary artery disease polygenic risk score”. American Society of Human Genetics. (2022)
  4. Leland Hull, Buu Truong*, Pradeep Natarajan. “Performance of externally developed polygenic risk scores in the All of Us Research Program Database”. American Society of Human Genetics. (2022)
  5. Aeron Small, Giorgio Melloni, Frederick Kamanu, Buu Truong, …, Pradeep Natarajan, Nicholas Marston. “Development of a Novel Polygenic Risk Score to Predict Risk of Aortic Stenosis Events”. American Heart Association. (2022)
  6. Thomas Gilliland, Jiwoo Lee, Yunfeng Ruan, Buu Truong, Sara Haidermota, Kim Lannery, Megan Wong, Satoshi Koyama, Pradeep Natarajan. (2022). Metabolomic profiles differentiate arterial and venous polygenic risk. American Heart Association. (2022)
  7. Xiaomei Li, Buu Truong*, Taosheng Xu, Lin Liu, Jiuyong Li, and Thuc D. Le. Identifying cell locations from subsets of marker genes - A winning method for the DREAM 2019 Single Cell Transcriptomics Challenge. OZ Single Cells+++. (2019)
  8. Buu Truong, Xuan Zhou, Jisu Shin, Jiuyong Li, Julius H. J. van der Werf, Thuc Duy Le, and S. Hong Lee. Efficient Polygenic Risk Scores For Biobank Scale Data By Exploiting Phenotypes From Inferred Relatives. Australia Polygenic Risk Symposium+++. (2019)
  9. Junpeng Zhang, Vu Viet Hoang Pham, Lin Liu, Taosheng Xu, Buu Truong, Jiuyong Li, Nini Rao, and Thuc Duy Le. Identifying miRNA synergism using multiple-intervention causal inference. International Conference on Genome Informatics & Australian Bioinformatics and Computational Biology Society+++. (2019)

PROFESSIONAL MEMBERSHIPS

  1. Polygenic Risk Methods in Diverse Populations (PRIMED) Consortium – Method development subcommittee.
  2. Biobank Rare Variant Analysis (BRaVa) – Method development subcommittee.