Inter-centre reliability in embryo grading across several IVF clinics is limited: implications for embryo selection

Reprod Biomed Online. 2022 Jan;44(1):39-48. doi: 10.1016/j.rbmo.2021.09.022. Epub 2021 Oct 6.

Abstract

Research question: What is the intra- and inter-centre reliability in embryo grading performed according to the Istanbul Consensus across several IVF clinics?

Design: Forty Day 3 embryos and 40 blastocysts were photographed on three focal planes. Senior and junior embryologists from 65 clinics were invited to grade them according to the Istanbul Consensus (Study Phase I). All participants then attended interactive training where a panel of experts graded the same embryos (Study Phase II). Finally, a second set of pictures was sent to both embryologists and experts for a blinded evaluation (Study Phase III). Intra-centre reliability was reported for Study Phase I as Cohen's kappa between senior and junior embryologists; inter-centre reliability was instead calculated between senior/junior embryologists and experts in Study Phase I versus III to outline improvements after training (i.e. upgrade of Cohen's kappa category according to Landis and Koch).
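For readers unfamiliar with the statistic, the agreement measure named in the Design can be sketched as follows. This is a minimal, unweighted Cohen's kappa for two raters who grade the same items. The function name `cohen_kappa` and the example grades are illustrative assumptions, not the authors' code or data, and the abstract does not say whether a weighted variant was used.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters grading the same items.

    Illustrative sketch only; not the analysis code used in the study.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must grade the same non-empty set of items")

    n = len(rater_a)
    # Observed agreement: share of items given the same grade by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal grade frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[g] * freq_b[g] for g in freq_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical grade throughout; kappa is undefined,
        # so this sketch reports perfect agreement by convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical grades (1 = good, 2 = fair) from a senior and a junior embryologist.
senior = [1, 1, 2, 2]
junior = [1, 1, 2, 1]
kappa = cohen_kappa(senior, junior)  # observed 0.75, chance 0.5
```

In this toy example the raters agree on three of four embryos, but half of that agreement is expected by chance, so kappa is lower than the raw 75% agreement.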

Results: Thirty-six embryologists from 18 centres participated (28% participation rate). The intra-centre reliability was (i) substantial (0.63) for blastomere symmetry (range -0.02 to 1.0), (ii) substantial (0.72) for fragmentation (range 0.29-1.0), (iii) substantial (0.66) for blastocyst expansion (range 0.19-1.0), (iv) moderate (0.59) for inner cell mass quality (range 0.07-0.92), and (v) moderate (0.56) for trophectoderm quality (range 0.01-0.97). The inter-centre reliability showed an overall improvement from Study Phase I to III, from fair (0.21-0.4) to moderate (0.41-0.6) for all parameters under analysis, except for blastomere fragmentation among senior embryologists, which was already moderate before training.
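The category labels attached to these kappa values follow the Landis and Koch bands. The mapping can be sketched as below, using the intra-centre values reported above as input. The function name `landis_koch` and the dictionary layout are illustrative assumptions; the band boundaries are the commonly cited Landis and Koch cut-offs.

```python
def landis_koch(kappa):
    """Map a Cohen's kappa value to its Landis and Koch agreement category."""
    if kappa < 0:
        return "poor"
    # Upper bound of each band, in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1.0")


# Intra-centre kappa values as reported in the Results.
reported = {
    "blastomere symmetry": 0.63,
    "fragmentation": 0.72,
    "blastocyst expansion": 0.66,
    "inner cell mass quality": 0.59,
    "trophectoderm quality": 0.56,
}
categories = {name: landis_koch(value) for name, value in reported.items()}
```

An "upgrade of Cohen's kappa category", as used for the Phase I versus Phase III comparison, then corresponds to a move to a higher band, for example from "fair" to "moderate".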

Conclusions: Intra-centre reliability was generally moderate/substantial, while inter-centre reliability was only fair. The interactive training improved it to moderate, so this workflow was deemed helpful. The establishment of external quality assessment services (e.g. UK NEQAS) and advances in artificial intelligence might further improve the reliability of this key practice for embryo selection.

Keywords: Blastocyst grading; Embryo grading; Inter-centre reliability; Intra-centre reliability; Morphological evaluation.

MeSH terms

  • Artificial Intelligence*
  • Blastocyst*
  • Embryo, Mammalian
  • Fertilization in Vitro
  • Humans
  • Reproducibility of Results