FaaSr: Cross-Platform Function-as-a-Service Serverless Scientific Workflows in R
dc.contributor.author | Park, Sungjae | en |
dc.contributor.author | Thomas, R. Quinn | en |
dc.contributor.author | Carey, Cayelan C. | en |
dc.contributor.author | Delany, Austin D. | en |
dc.contributor.author | Ku, Yun-Jung | en |
dc.contributor.author | Lofton, Mary E. | en |
dc.contributor.author | Figueiredo, Renato J. | en |
dc.date.accessioned | 2024-12-20T19:40:55Z | en |
dc.date.available | 2024-12-20T19:40:55Z | en |
dc.date.issued | 2024-09 | en |
dc.description.abstract | Modern Function-as-a-Service (FaaS) cloud platforms offer great potential for supporting event-driven scientific workflows. Nonetheless, barriers to adoption remain for the scientific community in domains such as the environmental sciences, where R is the focal language for application development and where users are typically not well versed in FaaS APIs. This paper describes the design and implementation of FaaSr, a novel middleware system that supports event-driven scientific workflows in R. A key novelty of FaaSr is the ability to deploy workflows across FaaS providers without the need for any managed servers for coordination. With FaaSr: 1) functions are written in R; 2) the runtime environments for their execution are customizable containers; 3) functions access data in cloud storage (S3) through a familiar file-based abstraction that supports both full-file put/get primitives and subsetting using the Parquet format; and 4) function invocation and workflow coordination require only S3 cloud object storage, without relying on any dedicated, active workflow engine server or cloud-specific queues/databases. The paper reports on the functionality and performance of FaaSr for micro-benchmarks and two case studies: event-driven forecast and batch job workflows. These results demonstrate the ability to deploy workflows across multiple platforms (GitHub Actions, Amazon Web Services Lambda, and the open-source OpenWhisk), without the need for dedicated coordination servers, across both cloud and edge resources. FaaSr is open-source and available as a CRAN package. | en |
dc.description.version | Accepted version | en |
dc.format.extent | 10 page(s) | en |
dc.format.mimetype | application/pdf | en |
dc.identifier.doi | https://doi.org/10.1109/e-Science62913.2024.10678660 | en |
dc.identifier.isbn | 979-8-3503-6562-7 | en |
dc.identifier.issn | 2325-372X | en |
dc.identifier.orcid | Thomas, Robert [0000-0003-1282-7825] | en |
dc.identifier.uri | https://hdl.handle.net/10919/123861 | en |
dc.language.iso | en | en |
dc.publisher | IEEE | en |
dc.rights | In Copyright | en |
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en |
dc.subject | cloud | en |
dc.subject | cyberinfrastructure | en |
dc.subject | Function-as-a-Service | en |
dc.subject | serverless | en |
dc.subject | workflow | en |
dc.subject | LAKE | en |
dc.title | FaaSr: Cross-Platform Function-as-a-Service Serverless Scientific Workflows in R | en |
dc.title.serial | 2024 IEEE 20th International Conference on E-Science, E-Science 2024 | en |
dc.type | Conference proceeding | en |
dc.type.dcmitype | Text | en |
dc.type.other | Proceedings Paper | en |
dc.type.other | Book in series | en |
pubs.finish-date | 2024-09-20 | en |
pubs.organisational-group | Virginia Tech | en |
pubs.organisational-group | Virginia Tech/Science | en |
pubs.organisational-group | Virginia Tech/Science/Biological Sciences | en |
pubs.organisational-group | Virginia Tech/All T&R Faculty | en |
pubs.organisational-group | Virginia Tech/Science/COS T&R Faculty | en |
pubs.organisational-group | Virginia Tech/Post-docs | en |
pubs.start-date | 2024-09-16 | en |