A Bibliometric Analysis of 34,692 Publications on Thyroid Cancer by Machine Learning: How Much Has Been Done in the Past Three Decades?

Front Oncol. 2021 Oct 14:11:673733. doi: 10.3389/fonc.2021.673733. eCollection 2021.

Abstract

Introduction: Thyroid cancer (TC) is the most common neck malignancy. However, the large number of publications on TC has not been well summarized or discussed using comprehensive methods. The purpose of this bibliometric study is to summarize scientific publications from the past three decades in the field of TC using a machine learning method.

Material and methods: Scientific publications focusing on TC from 1990 to 2020 were searched in PubMed using the MeSH term "thyroid neoplasms". The full associated records were downloaded in PubMed format and extracted using R. Latent Dirichlet allocation (LDA) was applied in Python to identify research topics from the abstract of each publication.
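As an illustration of how LDA assigns the words of each abstract to topics, the following is a minimal sketch of a collapsed Gibbs sampler in plain Python. It is not the authors' implementation: the abstract does not state which Python library, preprocessing, number of topics, or hyperparameters were used, so the function names (`fit_lda`, `top_words`), the values of `alpha`, `beta`, and `n_iter`, and the toy "abstracts" below are all assumptions made for demonstration only.

```python
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    alpha and beta are symmetric Dirichlet priors (illustrative values).
    Returns the vocabulary, per-document topic counts, and per-topic word counts.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # tokens of doc d in topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens of word v in topic k
    topic_total = [0] * n_topics                           # all tokens in topic k
    assignments = []                                       # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z = []
        for w in doc:
            k = rng.randrange(n_topics)
            z.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z)

    topics = range(n_topics)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in topics
                ]
                k = rng.choices(topics, weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    return vocab, doc_topic, topic_word


def top_words(vocab, topic_word, n=3):
    """Return the n highest-count words for each topic."""
    result = []
    for row in topic_word:
        order = sorted(range(len(vocab)), key=lambda i: -row[i])
        result.append([vocab[i] for i in order[:n]])
    return result


# Made-up toy "abstracts", not data from the study.
TOY_ABSTRACTS = [
    "thyroidectomy surgery outcome recurrence surgery",
    "braf mutation gene expression mutation",
    "surgery recurrence thyroidectomy outcome",
    "gene braf expression mutation pathway",
]
docs = [text.lower().split() for text in TOY_ABSTRACTS]

vocab, doc_topic, topic_word = fit_lda(docs, n_topics=2)
for k, words in enumerate(top_words(vocab, topic_word)):
    print(f"topic {k}: {', '.join(words)}")
```

On a corpus of real size, an established library implementation would normally be preferred over a hand-written sampler; the sketch only shows the mechanics of topic assignment.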

Results: A total of 34,692 publications related to TC from the last three decades were found and included in this study, with an average of 1,119.1 publications per year. Clinical studies and experimental studies accounted for the largest proportion of publications, while the proportion of clinical trials remained relatively small (peaking at 5.87% in 2004). Thyroidectomy was the leading MeSH term, followed by prognosis, differential diagnosis, and fine-needle biopsy. The LDA analyses showed that the study topics were divided into four clusters: treatment management, basic research, diagnosis research, and epidemiology and cancer risk. However, a relatively weak connection was shown between treatment management and basic research. The 10 most cited publications in recent years particularly highlighted the application of active surveillance in TC.
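The counts reported above can be reproduced in outline with a short script. The sketch below, written in Python rather than the R used in the study, parses records in PubMed (MEDLINE) text format, counts how many records carry each MeSH descriptor, and checks the per-year average, which corresponds to 34,692 publications over the 31 publication years from 1990 to 2020 inclusive. The helper names (`parse_medline`, `count_mesh`) and the two sample records are hypothetical and are not taken from the study's data.

```python
from collections import Counter


def parse_medline(text):
    """Parse PubMed-format text into a list of {tag: [values]} records."""
    records, current, tag = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current record.
            if current:
                records.append(current)
            current, tag = {}, None
        elif line.startswith("      ") and tag:
            # Continuation lines are indented by six spaces.
            current[tag][-1] += " " + line.strip()
        elif len(line) > 5 and line[4] == "-":
            # Field lines look like "MH  - Thyroidectomy".
            tag = line[:4].strip()
            current.setdefault(tag, []).append(line[6:].strip())
    if current:
        records.append(current)
    return records


def count_mesh(records):
    """Count the number of records in which each MeSH descriptor appears."""
    counts = Counter()
    for record in records:
        descriptors = set()
        for heading in record.get("MH", []):
            # Drop subheadings after "/" and the "*" major-topic marker.
            descriptors.add(heading.split("/")[0].lstrip("*"))
        counts.update(descriptors)
    return counts


# Two made-up records for illustration only.
SAMPLE = """\
PMID- 1
DP  - 1990 Jan
AB  - Toy abstract about surgery
      that continues on a second line.
MH  - Thyroid Neoplasms/*surgery
MH  - *Thyroidectomy

PMID- 2
DP  - 2020 Mar
MH  - Thyroid Neoplasms/diagnosis
MH  - Prognosis
MH  - Thyroidectomy
"""

records = parse_medline(SAMPLE)
mesh_counts = count_mesh(records)
print(mesh_counts.most_common())

# Per-year average reported in the abstract: 1990 to 2020 inclusive is 31 years.
n_years = 2020 - 1990 + 1
average_per_year = 34692 / n_years
print(round(average_per_year, 1))
```

Because the search term itself ("thyroid neoplasms") appears in every retrieved record, a ranking of MeSH terms is only informative once that descriptor is set aside.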

Conclusion: Thyroidectomy, differential diagnosis, genomic analysis, and active surveillance are the topics of greatest concern in TC research. Although BRAF-targeted therapy is under development with promising results, there is still an urgent need to translate basic studies into clinical practice.

Keywords: bibliometrics; latent Dirichlet allocation; machine learning; natural language processing; thyroid cancer.