MS - Biostatistics and Data Science

Master of Science in Biostatistics and Data Science

The Master of Science (MS) in Biostatistics and Data Science is a two-year program that prepares graduates to extract, analyze, and translate large volumes of data into actionable evidence and to communicate findings to collaborators from other disciplines. The program integrates competencies in statistics, computer science, and epidemiology, a combination that is crucial for analyzing increasingly complex health-related data. Graduates will exhibit competence in:

  • Fundamental statistical theory
  • Common methods in biostatistics (regression, survival, and longitudinal analyses)
  • Statistical and computer programming languages (R, SAS, Stata, Python, SQL)
  • Machine learning methods, data visualization, and big data wrangling

Gain real-world experience

Through supervised consulting sessions and an internship, students will develop the technical and collaborative skills necessary to excel in clinical, academic, government, industrial, and population health organizations.

Primary objective

To graduate leaders in statistical theory, practical data analysis, big data management and manipulation, and scientific communication.

All biostatisticians and data scientists must master these competencies to support basic science, clinical, and population health studies.

Supervised consulting sessions, an internship, and directed research give students ample opportunities to work with high-quality data and reputable researchers from two epidemiologic studies supported by the National Institutes of Health. The Jackson Heart Study (JHS) is the largest-ever single-site study of cardiovascular disease and its causes in African Americans. The Atherosclerosis Risk in Communities (ARIC) study is designed to investigate the causes of atherosclerosis and its clinical outcomes, as well as the variation in cardiovascular risk factors and disease by race, gender, and location.

Graduates of the program will be able to:
  • Efficiently collect, clean, organize, and appropriately analyze biomedical, clinical, and population health data;
  • Use standard statistical (R, SAS, and Stata) and computer (Python) programming languages to reproducibly explore and visualize data, fit models, conduct inference, and translate analysis results;
  • Conduct all facets of big data analysis, including the extraction, storage, manipulation, and analysis of massive genetic and bioinformatics datasets;
  • Convert information contained in databases and data warehouses into actionable findings using machine learning and other data science techniques;
  • Adhere to rigorous ethical and methodological standards when analyzing real-world data;
  • Collaborate with non-statisticians and communicate findings to the scientific and general community to improve health care and prevent disease.