English colour terms carry gender and valence biases: A corpus study using word embeddings

PLoS One. 2021 Jun 1;16(6):e0251559. doi: 10.1371/journal.pone.0251559. eCollection 2021.

Abstract

In Western societies, the stereotype prevails that pink is for girls and blue is for boys. A third possible gendered colour is red. While liked by women, it represents power, stereotypically a masculine characteristic. Empirical studies confirmed such gendered connotations when testing colour-emotion associations or colour preferences in males and females. Furthermore, empirical studies demonstrated that pink is a positive colour, blue is mainly a positive colour, and red is both a positive and a negative colour. Here, we assessed whether the same valence and gender connotations appear in widely available written texts (Wikipedia and newswire articles). Using a word embedding method (GloVe), we extracted gender and valence biases for blue, pink, and red, as well as for the remaining basic colour terms, from a large English-language corpus containing six billion words. We confirmed that pink was biased towards femininity and positivity, and that blue was biased towards positivity. We found no strong gender bias for blue, and no strong gender or valence biases for red. For the remaining colour terms, we only found that green, white, and brown were positively biased. Our finding on pink shows that writers of widely available English texts use this colour term to convey femininity. This gendered communication reinforces the notion that results from research studies find their analogue in real-world phenomena. Some of our other findings were consistent with results from research studies, whereas others were not. We argue that widely available written texts have biases of their own, because they have been filtered according to context, time, and what is appropriate to be reported.
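
The abstract does not spell out how a bias is extracted from word embeddings. One common approach, shown here as a minimal sketch and not necessarily the authors' procedure, is to compare a colour term's mean cosine similarity to two sets of attribute words (for example female versus male words). The toy vectors, the attribute word lists, and the function names below are all hypothetical and chosen for illustration; real GloVe vectors would be loaded from the published pretrained files.

```python
import math

# Toy 2-dimensional vectors, invented for illustration only.
# Real GloVe vectors have far more dimensions and are loaded from a file.
TOY_VECTORS = {
    "pink": [0.9, 0.1],
    "red": [0.5, 0.5],
    "she": [1.0, 0.0],
    "her": [0.9, 0.2],
    "he": [0.0, 1.0],
    "him": [0.2, 0.9],
}

# Hypothetical attribute word sets; the study's actual word lists are not given here.
FEMALE_WORDS = ["she", "her"]
MALE_WORDS = ["he", "him"]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_similarity(term, attribute_words, vectors):
    """Average cosine similarity between a term and a set of attribute words."""
    sims = [cosine(vectors[term], vectors[w]) for w in attribute_words]
    return sum(sims) / len(sims)


def bias_score(term, pole_a, pole_b, vectors):
    """Positive values mean the term sits closer to pole_a than to pole_b."""
    return (mean_similarity(term, pole_a, vectors)
            - mean_similarity(term, pole_b, vectors))


pink_bias = bias_score("pink", FEMALE_WORDS, MALE_WORDS, TOY_VECTORS)
red_bias = bias_score("red", FEMALE_WORDS, MALE_WORDS, TOY_VECTORS)
print("pink gender bias:", pink_bias)
print("red gender bias:", red_bias)
```

With these invented vectors, the score for pink is positive (closer to the female words) and the score for red is essentially zero, which mirrors the direction of the reported findings only because the toy data were built that way. A valence bias could be sketched the same way by swapping in sets of positive and negative words.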

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Color
  • Female
  • Humans
  • Language*
  • Male
  • Sexism*
  • Young Adult

Grants and funding

This research was made possible through a Doc.CH fellowship grant to DJ (P0LAP1_175055) and a project-funding grant to CM (100014_182138) from the Swiss National Science Foundation (http://www.snf.ch/en). AS was supported by the Engineering and Physical Sciences Research Council (EP/I028153/ and EP/L016656/1; https://epsrc.ukri.org/). NC was supported by the ERC grant ThinkBIG (https://cordis.europa.eu/project/id/339365).