Sharing and repeating scientific applications is crucial for verifying claims, reproducing experimental results (e.g., repeating a computational experiment described in a publication), and promoting reuse of complex applications. The predominant methods of sharing applications and making them repeatable are building a companion web site and/or provisioning a virtual machine image (VMI). Recently, application virtualization (AV) has emerged as a light-weight alternative for sharing and efficient repeatability. AV approaches such as Linux Containers create a chroot-like environment, while approaches such as CDE trace system calls during application execution to copy all binaries, data, and software dependencies into a self-contained package. In principle, application virtualization techniques can also be applied to DB applications, i.e., applications that interact with a relational database. However, these techniques treat a database system as a black-box application process and are thus oblivious to the query statements or database model supported by the database system. To overcome this shortcoming and leverage database semantics, we have introduced light-weight database virtualization (LDV, https://github.com/legendOfZelda/LDV.git), a tool for creating packages of DB applications. An LDV package is light-weight as it encapsulates only the application and its necessary and relevant dependencies (input files, binaries, and libraries) as well as only the necessary and relevant data from the database with which the application interacted. LDV relies on data provenance to determine which database tuples and input files are relevant. While monitoring an application to create a package, we incrementally construct an execution trace (a provenance graph) that records dependencies across OS and DB boundaries. In addition to providing a detailed record of how files and tuples have been produced by the application, the trace is used to determine what should be included in the package.
The primary objective of this demonstration is to show the benefits of using LDV for repeating and understanding DB applications. For this, we consider real-world data sharing scenarios that involve a database and highlight the sharing and reproducibility challenges associated with them. We give an overview of our LDV approach to show how it can be used to build a light-weight package of a DB application that can be easily shared and reproduced. During the demonstration, the audience will experience three key features of LDV: (i) its ability to create self-contained packages of a DB application that can be shared and run on different machine configurations without the need to install a database system and set up a database, (ii) how LDV extracts a slice of the database accessed by an application, and (iii) how LDV's execution traces can be used to understand how the files, processes, SQL operations, and database content of an application are related to each other.
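The idea of using an execution trace to decide what goes into a package can be illustrated with a small sketch. This is not LDV's implementation: the `ProvenanceGraph` class, its method names, and the node labels (`file:…`, `process:…`, `query:…`, `tuple:…`) are hypothetical, chosen only to show how a backward traversal over recorded dependencies could select the input files and database tuples an output actually depends on. Binaries and libraries, which a real package would also contain, are left out.

```python
class ProvenanceGraph:
    """Toy execution trace (hypothetical, not LDV's data model).

    An edge goes from a derived node to a node it depends on. Nodes may be
    files, processes, SQL queries, or database tuples.
    """

    def __init__(self):
        self.depends_on = {}

    def add_dependency(self, derived, source):
        # Record that `derived` was produced using `source`.
        self.depends_on.setdefault(derived, set()).add(source)
        self.depends_on.setdefault(source, set())

    def relevant_inputs(self, outputs):
        # Walk dependencies backwards from the outputs and keep the nodes
        # that depend on nothing else: the original input files and tuples.
        seen, stack = set(), list(outputs)
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.depends_on.get(node, ()))
        return {n for n in seen if not self.depends_on.get(n)}


graph = ProvenanceGraph()
# OS side: the report was written by a process that read an input file.
graph.add_dependency("file:report.txt", "process:app")
graph.add_dependency("process:app", "file:input.csv")
# DB side: the same process ran a query whose result came from two tuples.
graph.add_dependency("process:app", "query:q1")
graph.add_dependency("query:q1", "tuple:orders/17")
graph.add_dependency("query:q1", "tuple:orders/42")
# An unrelated output that used a different tuple.
graph.add_dependency("file:other.log", "tuple:orders/99")

package = graph.relevant_inputs({"file:report.txt"})
print(sorted(package))
# ['file:input.csv', 'tuple:orders/17', 'tuple:orders/42']
```

In this toy trace, `tuple:orders/99` is left out because nothing that produced `report.txt` depends on it, which mirrors the notion of packaging only a slice of the database.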