Jupyter and Galaxy: Easing entry barriers into complex data analyses for biomedical researchers

Björn A Grüning; Eric Rasche; Boris Rebolledo-Jaramillo; Carl Eberhard; Torsten Houwaart; John Chilton; Nate Coraor; Rolf Backofen; James Taylor; Anton Nekrutenko

doi:10.1371/journal.pcbi.1005425

PLoS Comput Biol. 2017 May 25;13(5):e1005425. doi: 10.1371/journal.pcbi.1005425. eCollection 2017 May.

Authors

Björn A Grüning¹,², Eric Rasche³, Boris Rebolledo-Jaramillo⁴, Carl Eberhard⁵, Torsten Houwaart¹, John Chilton⁶, Nate Coraor⁶, Rolf Backofen¹,², James Taylor⁵, Anton Nekrutenko⁶

Affiliations

¹ Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University, Freiburg, Freiburg, Germany.
² Center for Biological Systems Analysis (ZBSA), University of Freiburg, Freiburg, Germany.
³ Department of Biochemistry and Biophysics, Texas A&M University, College Station, Texas, United States of America.
⁴ Centro de Genética y Genómica, Universidad del Desarrollo, Santiago, Chile.
⁵ Department of Biology, Johns Hopkins University, Baltimore, Maryland, United States of America.
⁶ Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania, United States of America.

Abstract

What does it take to convert a heap of sequencing data into a publishable result? First, common tools are employed to reduce primary data (sequencing reads) to a form suitable for further analyses (e.g., a list of variable sites). The subsequent exploratory stage is much more ad hoc and requires the development of custom scripts and pipelines, which makes it a barrier for many biomedical researchers. Here, we describe a hybrid platform that combines common analysis pathways with the ability to explore data interactively. It aims to fully encompass and simplify the "raw data-to-publication" pathway and make it reproducible.

Publication types

Research Support, Non-U.S. Gov't
Research Support, N.I.H., Extramural

MeSH terms

Biomedical Research / methods*
Biomedical Research / organization & administration*
Computational Biology*
High-Throughput Nucleotide Sequencing*
Humans
Research Personnel*
Software*

Grants and funding

U41 HG006620/HG/NHGRI NIH HHS/United States