Automatic Restoration and Management of Computational Notebooks
dc.contributor.author | Venkatesan, Satish | en |
dc.contributor.committeechair | Gulzar, Muhammad Ali | en |
dc.contributor.committeemember | Tilevich, Eli | en |
dc.contributor.committeemember | Meng, Na | en |
dc.contributor.department | Computer Science | en |
dc.date.accessioned | 2022-03-04T09:00:22Z | en |
dc.date.available | 2022-03-04T09:00:22Z | en |
dc.date.issued | 2022-03-03 | en |
dc.description.abstract | Computational notebook platforms are widely used by programmers and data scientists. However, because of the interactive development environment of notebooks, developers struggle to maintain effective code organization, which has an adverse effect on their productivity. In this thesis, we research and develop techniques that address the code organization issues developers face, in an effort to improve productivity. Notebooks are often executed out of order, which adversely affects their portability. To determine cell execution orders in computational notebooks, we develop a technique that determines the execution order for a given cell and, if need be, attempts to rearrange the cells to match the intended execution order. With such a tool, users would not need to manually determine the execution orders themselves. In a user study with 9 participants, our approach saves users, on average, about 95% of the time required to determine execution orders manually. We also developed a technique that supports insertion of cells in rows, in addition to the standard column insertion, to better represent multiple contexts. In a user study with 9 participants, this technique received an average rating of 8.44 on a scale of one to ten for representing multiple contexts, compared with 4.77 for the standard view. | en |
dc.description.abstractgeneral | In the field of data science, computational notebooks are a very commonly used tool. They allow users to create programs that perform computations and display graphs, tables, and other visualizations to supplement their analysis. Computational notebooks have some limitations in their development environment that can make it difficult for users to organize their code. This can make it very difficult to read through and analyze the code to find or fix errors, which in turn can have a very negative effect on developer productivity. In this thesis, we research methods to improve the development environment and increase developer productivity. We achieve this by offering tools that help users organize and clean up their code, making it easier to comprehend the code and make any necessary changes. | en |
dc.description.degree | Master of Science | en |
dc.format.medium | ETD | en |
dc.identifier.other | vt_gsexam:33990 | en |
dc.identifier.uri | http://hdl.handle.net/10919/109097 | en |
dc.language.iso | en | en |
dc.publisher | Virginia Tech | en |
dc.rights | In Copyright | en |
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en |
dc.subject | Computational Notebooks | en |
dc.subject | Dependency Analysis | en |
dc.subject | Version Control | en |
dc.title | Automatic Restoration and Management of Computational Notebooks | en |
dc.type | Thesis | en |
thesis.degree.discipline | Computer Science and Applications | en |
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
thesis.degree.level | masters | en |
thesis.degree.name | Master of Science | en |