An Overview of Jupyter Notebooks
Data is an important part of today’s software engineering ecosystem. The data that an individual or a company collects is not worth much unless it can be shared, distributed, and analyzed. Without a standardized method of distributing data, viewing it becomes cumbersome. A recipient may have to troubleshoot code or download and install libraries, and if the original data is corrupted, the analysis yields the wrong results. Project Jupyter aims to provide a way of packaging data so that it becomes easy to view and share.
In the web development world it is possible to share snippets of code, but those solutions often rely on a third-party service just to run them. Think of JSFiddle or CodePen. They are great for sharing snippets, but sharing an entire project with them is difficult. What if the author of a data project could create a package that simply works, without worrying about whether the end user will be able to view the data?
A Jupyter Notebook is like a digital Trapper Keeper filled with everything an individual or organization needs to analyze data. This is a key feature of Jupyter. Each Notebook holds not only the code but also the narrative text and the saved output of that code, so data and results travel together. The software libraries the code depends on are not stored in the Notebook file itself; they come from the environment the Notebook runs in, which the author can specify and share alongside it. Packaged this way, a Notebook is self-contained: anyone who accesses it can run it and view the data as intended. If the author so chooses, Notebooks can be published so that viewers do not have to download anything at all: the Notebook can be hosted on a server and rendered as HTML. For the original author(s), this means the code and data within a Notebook are preserved and ready to be viewed by anyone who has access. Users can even fork a Notebook so they can adjust and extend its findings.
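To make the contents of a Notebook concrete: a notebook file (.ipynb) is a JSON document. The following is a minimal sketch of that structure, assuming the version 4 notebook format. The build_notebook helper and the sample cell contents are hypothetical, and the sketch uses only Python's standard library rather than Jupyter's own tooling.

```python
import json


def build_notebook():
    """Return a minimal notebook as a plain dict (assumes the nbformat 4.4 layout)."""
    markdown_cell = {
        "cell_type": "markdown",
        "metadata": {},
        "source": "# Monthly totals\nNarrative text lives next to the code.",
    }
    code_cell = {
        "cell_type": "code",
        "execution_count": 1,
        "metadata": {},
        "source": "sum([120, 95, 143])",
        # Saved output: what a reader sees without re-running the cell.
        "outputs": [
            {
                "output_type": "execute_result",
                "execution_count": 1,
                "data": {"text/plain": "358"},
                "metadata": {},
            }
        ],
    }
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {},
        "cells": [markdown_cell, code_cell],
    }


notebook = build_notebook()
# Serialising to JSON gives the text that would be stored in a .ipynb file.
notebook_json = json.dumps(notebook, indent=1)
# Reading it back recovers the same cells, code, and saved output.
restored = json.loads(notebook_json)
cell_types = [cell["cell_type"] for cell in restored["cells"]]
```

Note that nothing in this structure carries the libraries the code imports; it records the code, the text, and the results that were saved the last time the cells ran.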
Once a Jupyter Notebook is created, it can be distributed in a wide variety of ways. The Notebook can be uploaded and rendered as an HTML page; it can be put on GitHub to be downloaded later; it can be put on a private server; or it can run locally for a presentation. Notebooks can also be hosted on a cloud platform so the end user doesn’t have to install anything on their own computer, and all the calculations are performed remotely. This flexibility is what makes Jupyter Notebooks such a fascinating option for sharing data.
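Rendering a Notebook as an HTML page is possible without running any code, because a notebook file is plain JSON with the outputs saved inside it (assuming the version 4 notebook format). In practice Jupyter's own converter, nbconvert, does this job. The sketch below is a deliberately simplified stand-in that uses only Python's standard library; render_notebook_html and the sample notebook are hypothetical names and data chosen for illustration.

```python
import html


def render_notebook_html(notebook):
    """Render a notebook dict to a bare-bones HTML string.

    A toy stand-in for nbconvert: Markdown is shown as raw text, and only
    text/plain outputs are handled.
    """
    parts = ["<html><body>"]
    for cell in notebook["cells"]:
        # The format allows "source" to be a string or a list of lines.
        source = cell["source"]
        if isinstance(source, list):
            source = "".join(source)
        if cell["cell_type"] == "markdown":
            parts.append("<p>" + html.escape(source) + "</p>")
        elif cell["cell_type"] == "code":
            parts.append("<pre>" + html.escape(source) + "</pre>")
            for output in cell.get("outputs", []):
                text = output.get("data", {}).get("text/plain", "")
                if isinstance(text, list):
                    text = "".join(text)
                if text:
                    parts.append("<pre>" + html.escape(text) + "</pre>")
    parts.append("</body></html>")
    return "\n".join(parts)


# Hypothetical two-cell notebook with one saved result.
sample = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "Totals for Q1 & Q2"},
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": "2 + 3",
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 1,
                    "data": {"text/plain": "5"},
                    "metadata": {},
                }
            ],
        },
    ],
}

page = render_notebook_html(sample)
```

Writing page to a file and opening it in a browser shows the saved result next to the code that produced it, which is why a viewer of a hosted Notebook needs nothing installed locally.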
One of the best-known uses of Jupyter Notebooks is connected to the 2017 Nobel Prize in Physics, awarded to Rainer Weiss, Barry C. Barish, and Kip S. Thorne for their contributions to the LIGO detector and the observation of gravitational waves. The research concluded that the detected waves were the result of two black holes colliding. To share this data, the LIGO collaboration published Jupyter Notebooks in which users can view the data and rerun the analysis for themselves.
Jupyter Notebooks work well within education. Recently colleges and universities have developed courses in which the classwork and homework revolve around studying, viewing, analyzing, and creating Jupyter Notebooks. In Canada, the Pacific Institute for the Mathematical Sciences (PIMS), Compute Canada, and Cybera teamed up to create a special version of Jupyter called syzygy.ca for students and professors across several Canadian universities.
There is almost no limit to how a programmer could use Jupyter Notebooks in their work. Whether it’s for data sharing, education, or research, the Jupyter Notebook is a tool that is likely to become indispensable for software engineers.
Project Jupyter is committed to being open source. All the necessary tools, extensions, and complementary programs needed to get started with Jupyter Notebooks are available to individuals under what the project describes as a liberal BSD license.