Using text data instead of SIC codes to tag innovative firms and classify industrial activities

PLoS One. 2022 Jun 30;17(6):e0270041. doi: 10.1371/journal.pone.0270041. eCollection 2022.

Abstract

The paper uses text mining and semantic algorithms to tag innovative firms and offer an alternative perspective for classifying industrial activities. Instead of referring to firms' standard industrial classification codes, we gather information from companies' websites and corporate purposes, extract keywords and generate tags concerning firms' activities, specializations, and competences. The evidence is interesting because it allows us to understand 'what firms do' in a more penetrating and up-to-date way than standard industrial classification codes permit. Moreover, by matching firms' keywords, we can explore the degree of closeness between the firms under observation, a measure from which researchers can derive industrial proximity. The analysis can provide policymakers with a detailed and comprehensive picture of the innovative trajectories underlying the industrial structure in a geographic area.
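
The keyword-matching idea can be illustrated with a minimal sketch. The abstract does not state which closeness measure the authors use, so the Jaccard similarity below is an assumption chosen for simplicity, and the firm names, tags, and function names are hypothetical placeholders rather than data or code from the study.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity: shared keywords divided by all distinct keywords."""
    a, b = set(a), set(b)
    union = a | b
    # Two firms with no keywords at all are treated as having zero closeness.
    return len(a & b) / len(union) if union else 0.0


def pairwise_proximity(keywords_by_firm):
    """Return a closeness score for every unordered pair of firms."""
    return {
        (f1, f2): jaccard(keywords_by_firm[f1], keywords_by_firm[f2])
        for f1, f2 in combinations(sorted(keywords_by_firm), 2)
    }


# Hypothetical firms and tags, invented for illustration only.
firm_keywords = {
    "firm_a": {"robotics", "automation", "sensors"},
    "firm_b": {"automation", "sensors", "packaging"},
    "firm_c": {"biotech", "diagnostics"},
}

proximity = pairwise_proximity(firm_keywords)
for pair, score in proximity.items():
    print(pair, round(score, 2))
```

Under this assumed measure, firms that share many tags score close to 1 and firms with no tags in common score 0, which gives one possible reading of 'industrial proximity' that does not rely on classification codes.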

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Industry*
  • Organizations*

Grants and funding

Fondirigenti G. Taliercio (Interprofessional fund for continuous training of managers promoted by Confindustria and Federmanager; www.fondirigenti.it) provided funding for the Strategic Initiative ‘Modelling of an Observatory to monitor the innovation ecosystem in the Chieti and Pescara area and mapping of managerial skills with a high innovation rate’, grant number: CIG 8188368708. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of the authors AM and CB are articulated in the ‘author contribution’ section.