Estimating Heritabilities and Breeding Values From Censored Phenotypes Using a Data Augmentation Approach

Front Genet. 2022 Jul 25:13:867152. doi: 10.3389/fgene.2022.867152. eCollection 2022.

Abstract

Time-dependent traits are often subject to censorship, where instead of precise phenotypes, only a lower and/or upper bound can be established for some of the individuals. Censorship reduces the precision of phenotypes but can represent compromise between measurement cost and animal ethics considerations. This compromise is particularly relevant for genetic evaluation because phenotyping initiatives often involve thousands of individuals. This research aimed to: 1) demonstrate a data augmentation approach for analysing censored phenotypes, and 2) quantify the implications of phenotype censorship on estimation of heritabilities and predictions of breeding values. First, we simulated uncensored phenotypes, representing fine-scale "age at puberty" for each individual in a population of some 5,000 animals across 50 herds. Analysis of these uncensored phenotypes provided a gold-standard control. We then produced seven "test" phenotypes by superimposing varying degrees of left, interval, and/or right censorship, as if herds were measured on only one, two or three occasions, with a binary measure categorized for animals at each visit (either pre or post pubertal). We demonstrated that our estimates of heritabilities and predictions of breeding values obtained using a data augmentation approach were remarkably robust to phenotype censorship. Our results have important practical implications for measuring time-dependent traits for genetic evaluation. More specifically, we suggest that data collection can be designed with relatively infrequent repeated measures, thereby reducing costs and increasing feasibility across large numbers of animals.

Keywords: MCMC; baysian; breeding; censored; data augmentation; gibbs sampling.