A Study of Multi-Task and Region-Wise Deep Learning for Food Ingredient Recognition

IEEE Trans Image Process. 2021;30:1514-1526. doi: 10.1109/TIP.2020.3045639. Epub 2020 Dec 31.

Abstract

Food recognition has attracted considerable research attention because of its importance for health-related applications. Existing approaches mostly focus on categorizing food by dish name while ignoring the underlying ingredient composition. In reality, two dishes with the same name do not necessarily share the exact same list of ingredients. Therefore, dishes under the same food category are not necessarily equal in nutritional content. Nevertheless, because few datasets with ingredient labels are available, the problem of ingredient recognition is often overlooked. Furthermore, as the number of ingredients is expected to be much smaller than the number of food categories, ingredient recognition is more tractable in real-world scenarios. This paper provides an insightful analysis of three compelling issues in ingredient recognition: recognition at either the image level or the region level, pooling at either a single image scale or multiple image scales, and learning in either a single-task or a multi-task manner. The analysis is conducted on a large food dataset, Vireo Food-251, contributed by this paper. The dataset comprises 169,673 images covering 251 popular Chinese food categories and 406 ingredients. It is sufficiently challenging in scale and complexity to reveal the limits of current approaches to ingredient recognition.

MeSH terms

  • China
  • Cooking
  • Deep Learning*
  • Food Ingredients / classification*
  • Humans
  • Image Processing, Computer-Assisted / methods*
  • Pattern Recognition, Automated / methods*

Substances

  • Food Ingredients