Scientific Abstract | Human Reproduction

Clinical Evaluation of an Image-Based Artificial Intelligence Model for Embryo Selection: A Double-Blinded Randomized Comparative Reader Study

July 3rd, 2024

Abstract

Study question What is the performance of an image-based artificial intelligence (AI) model for ranking blastocyst-stage embryos compared to embryologists using traditional morphology?

Summary answer The AI was non-inferior to manual embryo selection. When the AI and manual selection disagreed, the embryos selected by the AI had a significantly higher clinical pregnancy rate.

What is known already In previous work, we developed an image-based AI model that predicts the likelihood of clinical pregnancy by analyzing a single static image of a blastocyst captured prior to biopsy or freeze. This model was trained on data from over 8,000 single-blastocyst transfer cycles performed between 2014 and 2021 at multiple U.S. IVF clinics.

Study design, size, duration We performed a retrospective, double-blinded, comparative reader study. The study included data from 438 single-blastocyst transfers from 10 different IVF clinics in the U.S. that were not part of previous model development or testing. Using this data, a set of 1,257 virtual patient panels was created. Each virtual patient panel included 2 to 5 embryos that were matched by age (18-29, 30-34, 35-37, ≥38), race (white, non-white, and unknown), and PGT status.
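
The matching criteria above can be sketched as a stratification key. This is a minimal illustration and not the study's actual procedure: the field names (`age`, `race`, `pgt`), the helper names, and the treatment of age as whole years are assumptions, and the abstract does not describe how panels were sampled within a stratum.

```python
from collections import defaultdict


def age_band(age):
    """Map an age in whole years to the age bands listed in the abstract."""
    if age < 18:
        raise ValueError("age below the youngest band (18-29)")
    if age <= 29:
        return "18-29"
    if age <= 34:
        return "30-34"
    if age <= 37:
        return "35-37"
    return ">=38"


def group_by_stratum(embryos):
    """Group embryo ids by (age band, race category, PGT status).

    `embryos` is a list of dicts with hypothetical keys
    'id', 'age', 'race', and 'pgt'.
    """
    strata = defaultdict(list)
    for embryo in embryos:
        key = (age_band(embryo["age"]), embryo["race"], embryo["pgt"])
        strata[key].append(embryo["id"])
    return dict(strata)
```

Under these assumptions, a virtual patient panel of 2 to 5 embryos would be drawn from within a single stratum.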

Participants/materials, setting, methods A group of 5 embryologists (readers) with varying levels of experience were asked to select their top embryo for transfer for each virtual patient panel (control arm) based on morphology grades. The AI model was also used to select a top embryo for transfer from each patient panel (treatment arm). The clinical pregnancy rates of the top-selected embryos were calculated and compared.
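
The comparison described above can be sketched as follows. This is a hedged illustration under simplifying assumptions: each panel is represented as a list of known outcomes (one boolean per embryo), each arm supplies one selected index per panel, and the function name and toy data are hypothetical and unrelated to the study data.

```python
def pregnancy_rate_of_picks(panels, picks):
    """Fraction of panels whose selected embryo led to a clinical pregnancy.

    panels: list of panels; each panel is a list of booleans giving the
            known outcome of each embryo's single-blastocyst transfer.
    picks:  one index per panel, the embryo chosen by a reader or the model.
    """
    if not panels or len(panels) != len(picks):
        raise ValueError("need one pick per panel and at least one panel")
    hits = sum(1 for panel, pick in zip(panels, picks) if panel[pick])
    return hits / len(panels)


# Toy data for illustration only (not from the study).
toy_panels = [[True, False], [False, True, True], [False, False], [True, True, False]]
reader_picks = [0, 0, 1, 2]  # hypothetical control-arm selections
model_picks = [0, 1, 0, 0]   # hypothetical treatment-arm selections

reader_rate = pregnancy_rate_of_picks(toy_panels, reader_picks)
model_rate = pregnancy_rate_of_picks(toy_panels, model_picks)
```

Because every embryo in a panel has a known transfer outcome, each arm's rate depends only on which embryo it selected.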

Main results and the role of chance The five embryologists disagreed on the top-pick embryo 34.6% of the time, rising to 43.9% when there were 3 or more embryos to choose from, which supports the need for a tool to standardize this decision. The clinical pregnancy rate of the control arm (average of embryologist readers) was 61.0% (individual rates of 58.9%, 59.6%, 61.5%, 61.6%, and 63.3%), and the clinical pregnancy rate of the treatment arm (AI model) was 62.3%, demonstrating non-inferiority (p < 0.001). The pregnancy rate of random embryo selection was 53.2%. All 5 of the embryologist readers agreed on the top-pick embryo 65% of the time. When all 5 readers agreed, the AI model disagreed with that consensus 31% of the time, and in these cases the AI model pregnancy rate was significantly higher by 8.6 percentage points (AI: 63.1%; embryologist consensus: 54.5%; p < 0.05).
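
The reported figures can be cross-checked with simple arithmetic. The sketch below uses only the rates quoted above; the variable names are illustrative, and it does not reproduce the non-inferiority test because the abstract does not state the margin or the test used.

```python
# Individual embryologist clinical pregnancy rates (%), as reported.
reader_rates = [58.9, 59.6, 61.5, 61.6, 63.3]

control_mean = sum(reader_rates) / len(reader_rates)  # control arm average
ai_rate = 62.3       # treatment arm (AI model), %
random_rate = 53.2   # random embryo selection, %

# Differences in percentage points.
ai_minus_control = ai_rate - control_mean
ai_minus_random = ai_rate - random_rate
consensus_gap = 63.1 - 54.5  # AI vs. unanimous reader consensus when they differ

# Share of panels with unanimous readers, from the 34.6% disagreement rate.
unanimous_share = 100 - 34.6

print(f"control mean: {control_mean:.1f}%")
print(f"AI minus control: {ai_minus_control:.1f} points")
print(f"AI minus random: {ai_minus_random:.1f} points")
print(f"consensus gap: {consensus_gap:.1f} points")
print(f"unanimous share: {unanimous_share:.1f}%")
```

The mean of the five reader rates rounds to the reported 61.0%, and the 8.6 figure is the absolute difference between 63.1% and 54.5%.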

Limitations, reasons for caution While data for this study was collected prospectively, the analysis was done retrospectively. Furthermore, readers were provided morphology grades from retrospective data rather than grading the embryos themselves. However, it is common for an embryologist to make an embryo selection based on morphology grades previously assigned by a different embryologist.

Wider implications of the findings The AI model was able to select the top embryo for transfer with performance comparable to experienced embryologists. Such a model could allow for automated and objective embryo selection using a single static image of each blastocyst-stage embryo.

DOI: https://doi.org/10.1093/humrep/deae108.541

Embryo Grading

ESHRE