This course provides a practical foundation for designing and evaluating high-quality multiple-choice assessments in medical education. Beginning with core principles of assessment purpose, reliability, and validity, learners will explore how these concepts inform meaningful item analysis. Through hands-on practice, participants will learn to calculate and interpret key item statistics — such as difficulty, discrimination, and distractor performance — and apply them to improve item quality and test effectiveness. The course concludes with a deeper dive into reliability metrics and how item analysis contributes to consistent, fair, and valid assessments. By the end, educators will be equipped to use item-level data to enhance the integrity and impact of their assessment practices.

Learn More About Your Course Instructor  

Francis O’Donnell is a senior psychometrician at NBME. She is part of the team that oversees the NBME Subject Exams, Self-Assessments, and Customized Assessment Services, as well as scoring and quality assurance for a selection of certification programs. She is passionate about how assessment can support learning throughout medical students’ educational trajectory. Dr. O’Donnell has been involved in several initiatives related to communicating assessment results in ways that are clearer and more actionable, including score report redesigns and the launch of INSIGHTS, a student dashboard experience. She holds a Ph.D. in Research, Educational Measurement, and Psychometrics from the University of Massachusetts Amherst.


Learning Goals 

  • Understand the importance of an assessment’s intended purpose and use
  • Recognize what it means for test scores to be reliable and valid
  • Learn how to calculate and interpret common item statistics
  • Learn how to interpret common reliability indices
  • Learn how to use item analysis to improve assessments
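To give a flavor of the calculations these goals cover, the core item statistics can be sketched in a few lines of plain Python. This is a minimal illustration, not course material: it assumes dichotomously scored (0/1) items, uses invented response data, and takes item difficulty as the proportion correct, discrimination as the correlation between an item score and the rest-score (total minus that item), and KR-20 as the reliability index.

```python
def pearson(x, y):
    """Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def item_analysis(responses):
    """Per-item difficulty (proportion correct) and discrimination
    (item-score correlation with the rest-score, i.e. total minus item)."""
    n = len(responses)
    n_items = len(responses[0])
    difficulty, discrimination = [], []
    for i in range(n_items):
        item = [r[i] for r in responses]
        difficulty.append(sum(item) / n)
        rest = [sum(r) - r[i] for r in responses]
        discrimination.append(pearson(item, rest))
    return difficulty, discrimination

def kr20(responses):
    """Kuder-Richardson 20: internal-consistency reliability for 0/1 items."""
    n = len(responses)
    k = len(responses[0])
    totals = [sum(r) for r in responses]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n  # population variance
    ps = [sum(r[i] for r in responses) / n for i in range(k)]
    sum_pq = sum(p * (1 - p) for p in ps)
    return (k / (k - 1)) * (1 - sum_pq / var_t)

# Invented example: 5 examinees (rows) responding to 4 items (columns).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
difficulty, discrimination = item_analysis(responses)
```

In practice, difficulties near 0 or 1 and discriminations near zero (or negative) flag items worth reviewing, while a KR-20 closer to 1 indicates more internally consistent scores.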