Computer Science 5th Year Masters Presentation

— 12:00pm

Location:
Virtual Presentation - ET - Remote Access - Zoom

Speaker:
YUAN JIANG, Masters Student, Computer Science Department, Carnegie Mellon University
https://www.linkedin.com/in/yuan-jiang-52392b17a

Elevating Jupyter Notebook Maintenance Tooling by Identifying and Extracting Notebook Structures

Data analysis is an exploratory, interactive, and often collaborative process. Computational notebooks have become a popular tool to support this process, in part because of their ability to interleave code, narrative text, and results. The exploratory nature of computational notebooks allows their users to edit and execute parts of their program in any order. In practice, however, notebooks are often criticized as hard to maintain and of low code quality, with problems such as unused or duplicated code and out-of-order code execution. Data scientists can benefit from better tool support when maintaining and evolving notebooks. We argue that identifying the structure of notebooks is central to such tool support. We present a lightweight and accurate approach to extract notebook structure and outline several ways such structure can be used to improve maintenance tooling for notebooks, including navigation and finding common structural patterns. In addition, we investigate the history of notebooks and present an approach to visualize how notebooks evolve over time. This visualization can be useful for understanding how a notebook changes over a series of versions and for identifying alternatives explored in different stages of a data analysis pipeline.

Thesis Committee:
Christian Kästner (Chair)
Eunsuk Kang
Shurui Zhou (University of Toronto)

Additional Information

Zoom Participation. See announcement.

For More Information:
tracyf@cs.cmu.edu

