Status, identity, and language: A study of issue discussions in GitHub

PLoS One. 2019 Jun 14;14(6):e0215059. doi: 10.1371/journal.pone.0215059. eCollection 2019.

Abstract

Successful open source software (OSS) projects comprise freely observable, task-oriented social networks with hundreds or thousands of participants and large amounts of (textual and technical) discussion. The sheer volume of interactions and participants makes it challenging for participants to find relevant tasks, discussions, and people. Tagging (e.g., @AmySmith) is a socio-technical practice that enables more focused discussion. Tagging important and relevant people helps advance discussions more effectively. However, for all but a few insiders, it can be difficult to identify who those people are. In this paper, we study tagging in OSS projects from a sociolinguistic perspective. First, we argue that textual content per se reveals a great deal about the status and identity of who is speaking and who is being addressed. Next, we suggest that this phenomenon can be usefully modeled using modern deep-learning methods. Finally, we illustrate the value of these approaches with tools that could help people find the important and relevant people for a discussion.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Databases, Factual*
  • Deep Learning
  • Humans
  • Language*
  • Linguistics
  • Software
  • User-Computer Interface

Grants and funding

This material is based upon work supported by the National Science Foundation under Grant No. 1414172.