Combining the power of R and Python: using Jupyter Notebook for data analysis

When we choose a programming language as a tool for analysing our data, we are faced with a large number of "Python versus R" debates. The aim of my presentation is not to extend this long line of debates, but to show how to combine the best features of the two languages. Jupyter Notebook (formerly IPython Notebook) is an interactive development tool (you can try it at https://try.jupyter.org/). It is not a standalone programming language, but an HTML- and JSON-based client-server application in which we can edit and execute notebook documents via a web browser (1). Execution is handled by a kernel, which runs the code embedded in the notebook in the programming language the notebook was written in. Many kernels are available (e.g. Python, R, MATLAB; for the full list see https://github.com/jupyter/jupyter/wiki/Jupyter-kernels/).

This is the point at which we can profit from the advantages of Python. It is a Swiss Army knife that can establish contact with other kernels and push and pull variables across programming languages. Imagine combining the visualization power and simplicity of seaborn (Python package) with the flexibility of pandas (Python package) while using the full functionality of the tidyverse (a collection of R packages), creating plenty of fancy mixed models and performing Bayesian analyses that we are only able to do in R, all within one interactive coding session. This is the opportunity that Jupyter Notebook offers us. The code in a notebook is split into cells that can be executed individually; variables persist after a cell has run and can be accessed from the other cells. With this feature we can build and immediately display a bar chart from our data in the middle of coding and then continue the analysis, which makes the notebook an extremely powerful and flexible method for data processing, visualization and analysis.
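The cell model described above can be imitated in a few lines of plain Python. This is only a sketch of the idea, not how a Jupyter kernel is actually implemented: `namespace` and `run_cell` are hypothetical names, and the example data are invented for illustration. Each "cell" is a piece of source code executed against one shared dictionary, so later cells see the variables defined by earlier ones.

```python
# Hypothetical stand-in for a kernel session: one dictionary that
# holds every variable defined so far, shared by all cells.
namespace = {}


def run_cell(source):
    """Execute one cell's source code against the shared namespace."""
    exec(source, namespace)


# Cell 1: define some (invented) data.
run_cell("scores = [2, 4, 9]")

# Cell 2: reuse the variable created in cell 1.
run_cell("mean_score = sum(scores) / len(scores)")

# Cell 3: reuse variables from both earlier cells.
run_cell("n_above = sum(1 for s in scores if s > mean_score)")

print(namespace["mean_score"])  # mean of the invented scores
print(namespace["n_above"])     # number of scores above the mean
```

Because the state lives in the session rather than in any single cell, a cell can be re-run or edited on its own without repeating the earlier steps, which is what makes the interactive workflow possible.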
Python has many packages for communicating with other languages (e.g. rpy2 for R), and this feature is integrated into Jupyter Notebook as cell magics, with which we can write entire code blocks in other languages during a Python session while using our previously defined variables. Additionally, Jupyter provides a dashboard for managing our documents and kernels, along with many other advanced features (e.g. comments and Markdown combined with the LaTeX document preparation system), which let us turn a notebook into a fully formatted documentation and analysis of an experiment. I want to introduce the benefits of this new and flexible approach to programmed statistics, illustrated with concrete examples.

(1) What is the Jupyter Notebook? (http://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/what_is_jupyter.html, retrieved on 29/10/2017)
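As a rough illustration of the rpy2 cell magic and of the JSON nature of notebook documents, the sketch below builds a two-cell notebook with the standard library only. It assumes that rpy2, R and pandas are installed in the environment where the notebook is later opened; the data frame contents, the variable names `df` and `means`, and the helper `code_cell` are hypothetical. In the second cell, `%%R -i df -o means` asks rpy2 to push the Python variable `df` into R before the cell runs and to pull the R variable `means` back into Python afterwards.

```python
import json

# Cell 1 (Python): load rpy2's IPython extension and define a variable.
python_cell = """\
%load_ext rpy2.ipython
import pandas as pd
df = pd.DataFrame({"group": ["a", "a", "b", "b"], "score": [1.0, 2.0, 3.0, 5.0]})
"""

# Cell 2 (R, via the cell magic): -i pushes df into R, -o pulls means back.
r_cell = """\
%%R -i df -o means
means <- aggregate(score ~ group, data = df, FUN = mean)
"""


def code_cell(source):
    """Wrap source text in the JSON structure of a notebook code cell."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source.splitlines(keepends=True),
    }


# Minimal notebook document (format version 4.4) for a Python 3 kernel.
notebook = {
    "cells": [code_cell(python_cell), code_cell(r_cell)],
    "metadata": {
        "kernelspec": {
            "name": "python3",
            "display_name": "Python 3",
            "language": "python",
        }
    },
    "nbformat": 4,
    "nbformat_minor": 4,
}

notebook_json = json.dumps(notebook, indent=1)
print(notebook_json)
```

Saving `notebook_json` to a file with the `.ipynb` extension should give a document that the notebook dashboard can open. The R code itself is executed only when the second cell is run in a live session.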
