A Pedagogical Approach to Create and Assess Domain-Specific Data Science Learning Materials in the Biomedical Sciences


Virginia Tech


This dissertation explores the creation of domain-specific learning materials for the biomedical sciences to address the educational gap in biomedical informatics, while also answering the call for statisticians to advocate for process improvements in other disciplines. Data science educational materials are now plentiful enough to have become a commodity. This provides an opportunity to create domain-specific learning materials that better motivate learning through real-world examples while also capturing the intricacies of working with data in a specific domain. This dissertation shows how persona methodologies can be combined with a backwards design approach to create domain-specific learning materials.

The work is divided into three (3) major steps: (1) create and validate a learner self-assessment survey that can identify learner personas by clustering; (2) combine the information from the persona methodology with a backwards design approach, using formative and summative assessments to curate, plan, and assess domain-specific data science workshop materials for short-term and long-term efficacy; and (3) pilot and identify how to manage real-time feedback within a data coding teaching session to drive better learner motivation and engagement.

The key findings from this dissertation suggest that using a structured framework to plan and curate learning materials is an effective way to identify key concepts in data science. However, creating and teaching learning materials is not, on its own, enough for long-term retention of knowledge. More effort toward long-term lesson maintenance and long-term strategies for practice will help learners retain the concepts learned from live instruction. Finally, it is essential to be careful and purposeful in content creation so as not to overwhelm learners, and to integrate their needs into the materials as a primary focus. Overall, this work contributes to the growing need for data science education in the biomedical sciences to train future clinicians to use and work with data and improve patient outcomes.



data science, data science education, pedagogy, medical education, biomedical sciences