Community-Driven Data Analysis Training for Biology

Cell Syst. 2018 Jun 27;6(6):752-758.e1. doi: 10.1016/j.cels.2018.05.012.

Abstract

The primary problem with the explosion of biomedical datasets is not the data, not computational resources, and not the required storage space, but the general lack of trained and skilled researchers to manipulate and analyze these data. Eliminating this problem requires the development of comprehensive educational resources. Here we present a community-driven framework that enables modern, interactive teaching of data analytics in life sciences and facilitates the development of training materials. The key feature of our system is that it is not a static collection of tutorials but a continuously improved one. Because tutorials are coupled with a web-based analysis framework, biomedical researchers can learn by performing computation themselves through a web browser without the need to install software or search for example datasets. Our ultimate goal is to expand the breadth of training materials to include fundamental statistical and data science topics and to precipitate a complete re-engineering of undergraduate and graduate curricula in life sciences. This project is accessible at https://training.galaxyproject.org.

Keywords: data analysis; genomics; next-generation sequencing; proteomics; training.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Computational Biology / education*
  • Computational Biology / methods*
  • Curriculum
  • Data Analysis
  • Education, Distance / methods
  • Education, Distance / trends
  • Humans
  • Research Personnel / education*
  • Software