TLDR: We built a library called testbook for unit testing notebooks.
Jupyter Notebooks have been around for quite some time now and are used extensively in the data science domain, mainly for experimentation and visualization. In recent times, however, notebooks have been making headway into production environments.
Some companies like Netflix have readily adopted notebooks in their production pipelines using tools like papermill and nteract. Read more about that here. With this, it becomes extremely important to have reliable and maintainable notebooks.
Previous attempts at unit testing notebooks involved:
- Writing tests in the notebook itself: This approach makes the notebook hard to read and maintain.
- Testing against saved notebook outputs: This would not be useful in real world situations where data usually changes, or when there are non-deterministic outputs to be tested.
- Writing integration tests for the whole notebook: Tools like papermill can be used for this (although the purpose of papermill is slightly different).
However, there was no such library available for unit tests. So Matthew and I built one!
Testbook is a unit testing framework for testing code in Jupyter Notebooks. It lets you run unit tests against notebooks from separate test files, effectively treating .ipynb files like .py files.
Here is an example of a unit test written using testbook.
Consider the following code cell in a Jupyter Notebook:
    def func(a, b):
        return a + b
You would write a unit test using testbook in a Python file as follows:
    import testbook

    @testbook.testbook('/path/to/notebook.ipynb', execute=True)
    def test_func(tb):
        func = tb.ref("func")
        assert func(1, 2) == 3
We designed the API so that it looks and feels like a normal unit test, except that the code under test runs in the notebook. Also, since testbook only provides the assertion part of the equation (pun intended), you can use testbook with practically any Python unit testing framework.
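To make the idea of testing notebook code from a separate file a bit more concrete, here is a rough, standard-library-only sketch. It is a simplified stand-in and not how testbook works: as far as I can tell, testbook runs the notebook in a real Jupyter kernel, whereas this sketch just reads the .ipynb JSON and runs the code cells with `exec` in the current process. The helpers `load_code_cells` and `run_notebook` and the `demo_notebook` dictionary are hypothetical names made up for this illustration.

```python
import json
import tempfile
from pathlib import Path


def load_code_cells(notebook_path):
    """Return the source of every code cell in an .ipynb file, in order."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    sources = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The notebook format stores "source" as a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        sources.append(source)
    return sources


def run_notebook(notebook_path):
    """Run all code cells in one shared namespace and return that namespace.

    Assumption: the cells contain plain Python only (no IPython magics).
    """
    namespace = {}
    for source in load_code_cells(notebook_path):
        exec(source, namespace)
    return namespace


# Hypothetical notebook containing the func cell from the example above.
demo_notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Demo"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["def func(a, b):\n", "    return a + b\n"],
        },
    ],
}


def test_func():
    # Write the demo notebook to a temporary file, then test against it.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "notebook.ipynb"
        path.write_text(json.dumps(demo_notebook), encoding="utf-8")
        namespace = run_notebook(path)
    assert namespace["func"](1, 2) == 3


test_func()
```

The sketch breaks down as soon as a notebook uses IPython magics or needs its own kernel environment, which is one reason to reach for a kernel-based tool like testbook for real notebooks.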
- Write conventional unit tests for Jupyter Notebooks
- Execute all cells, or only specific ones, before a unit test
- Share kernel context across multiple tests (using pytest fixtures)
- Support for patching objects
- Inject code into Jupyter notebooks
- Works with any unit testing library - unittest, pytest or nose
Testbook can be used to write evaluation scripts for assignments submitted using Jupyter Notebooks. The team at Jovian.ml is currently using testbook to conduct automated evaluation of assignments submitted for their course Data Analysis with Python: Zero to Pandas.
Testing ETL Workflows
The team at the National Solar Observatory, Colorado, has investigated the use of testbook for testing their ETL workflows, which are written using notebooks.
Testbook has been very well received by the community so far.
So happy to see an OSS solution for unit testing @ProjectJupyter notebooks. This is a game-changer for notebook reliability, and it substantially increases the viability of notebooks for mission-critical production use. Nice work, @imrohitsanj! 👏 https://t.co/o7rbC6N8w7— Michelle Ufford (@MichelleUfford) July 1, 2020
Nice! So good to have tests in a separate file than the artifact (notebook) being tested.— Amit Rathi (@amittrathi) July 1, 2020
We are desperately looking for more users to try out testbook.
If you are looking for assistance on setting up a unit testing suite for your notebooks, you can reach out to me directly at email@example.com and I would be more than willing to help you.
Thank you for reading my blog! Please let me know what you think in the comments below. I’d love to have a chat.