FaaSr: Cross-Platform Function-as-a-Service Serverless Scientific Workflows in R

dc.contributor.author: Park, Sungjae (en)
dc.contributor.author: Thomas, R. Quinn (en)
dc.contributor.author: Carey, Cayelan C. (en)
dc.contributor.author: Delany, Austin D. (en)
dc.contributor.author: Ku, Yun-Jung (en)
dc.contributor.author: Lofton, Mary E. (en)
dc.contributor.author: Figueiredo, Renato J. (en)
dc.date.accessioned: 2024-12-20T19:40:55Z (en)
dc.date.available: 2024-12-20T19:40:55Z (en)
dc.date.issued: 2024-09 (en)
dc.description.abstract: Modern Function-as-a-Service (FaaS) cloud platforms offer great potential for supporting event-driven scientific workflows. Nonetheless, there remain barriers to adoption by the scientific community in domains such as environmental sciences, where R is the focal language used for the development of applications and where users are typically not well-versed with FaaS APIs. This paper describes the design and implementation of FaaSr, a novel middleware system that supports event-driven scientific workflows in R. A key novelty in FaaSr is the ability to deploy workflows across FaaS providers without the need for any managed servers for coordination. With FaaSr: 1) functions are written in R; 2) the runtime environments for their execution are customizable containers; 3) functions access data in cloud storage (S3) with a familiar file-based abstraction supporting both full file put/get primitives and subsetting using the Parquet format; and 4) function invocation and workflow coordination only requires S3 cloud object storage, without relying on any dedicated, active workflow engine server or cloud-specific queues/databases. The paper reports on the functionality and performance of FaaSr for micro-benchmarks and two case studies: event-driven forecast and batch job workflows. These demonstrate the ability to deploy workflows across multiple platforms (GitHub Actions, Amazon Web Services Lambda, and the open-source OpenWhisk), without the need for dedicated coordination servers, across both cloud and edge resources. FaaSr is open-source and available as a CRAN package. (en)
dc.description.version: Accepted version (en)
dc.format.extent: 10 page(s) (en)
dc.format.mimetype: application/pdf (en)
dc.identifier.doi: https://doi.org/10.1109/e-Science62913.2024.10678660 (en)
dc.identifier.isbn: 979-8-3503-6562-7 (en)
dc.identifier.issn: 2325-372X (en)
dc.identifier.orcid: Thomas, Robert [0000-0003-1282-7825] (en)
dc.identifier.uri: https://hdl.handle.net/10919/123861 (en)
dc.language.iso: en (en)
dc.publisher: IEEE (en)
dc.rights: In Copyright (en)
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ (en)
dc.subject: cloud (en)
dc.subject: cyberinfrastructure (en)
dc.subject: Function-as-a-Service (en)
dc.subject: serverless (en)
dc.subject: workflow (en)
dc.subject: LAKE (en)
dc.title: FaaSr: Cross-Platform Function-as-a-Service Serverless Scientific Workflows in R (en)
dc.title.serial: 2024 IEEE 20TH International Conference on E-Science, E-Science 2024 (en)
dc.type: Conference proceeding (en)
dc.type.dcmitype: Text (en)
dc.type.other: Proceedings Paper (en)
dc.type.other: Book in series (en)
pubs.finish-date: 2024-09-20 (en)
pubs.organisational-group: Virginia Tech (en)
pubs.organisational-group: Virginia Tech/Science (en)
pubs.organisational-group: Virginia Tech/Science/Biological Sciences (en)
pubs.organisational-group: Virginia Tech/All T&R Faculty (en)
pubs.organisational-group: Virginia Tech/Science/COS T&R Faculty (en)
pubs.organisational-group: Virginia Tech/Post-docs (en)
pubs.start-date: 2024-09-16 (en)
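
Illustrative note on the abstract: the abstract states that workflow coordination requires only S3 object storage, with no dedicated workflow engine. The Python sketch below illustrates that general idea under loud assumptions. It is not FaaSr's implementation or API. An in-memory dictionary stands in for an S3 bucket, and the workflow graph, the key names (`done/...`, `started/...`), and the `invoke` function are all hypothetical. Each function records a completion marker in the store when it finishes, and a successor is started only once markers exist for all of its predecessors.

```python
from typing import Dict, List

# Hypothetical stand-in for an S3 bucket: object key -> object bytes.
store: Dict[str, bytes] = {}

# Hypothetical workflow graph: function name -> names of its successors.
# "merge" has two predecessors and must wait for both of them.
WORKFLOW: Dict[str, List[str]] = {
    "start": ["left", "right"],
    "left": ["merge"],
    "right": ["merge"],
    "merge": [],
}

run_order: List[str] = []  # records the order in which functions ran


def predecessors(name: str) -> List[str]:
    """Return every function that lists `name` as a successor."""
    return [p for p, successors in WORKFLOW.items() if name in successors]


def invoke(name: str) -> None:
    """Run one function, then trigger any successor whose inputs are ready."""
    run_order.append(name)           # placeholder for the user's actual work
    store[f"done/{name}"] = b"1"     # completion marker in the object store

    for successor in WORKFLOW[name]:
        ready = all(f"done/{p}" in store for p in predecessors(successor))
        # The "started" marker keeps a successor from running twice.
        # Assumption: this check-then-write is not atomic; a real object
        # store would need a safe locking or conditional-write scheme.
        if ready and f"started/{successor}" not in store:
            store[f"started/{successor}"] = b"1"
            invoke(successor)


invoke("start")
print(run_order)
```

In this sketch the only shared state is the key-value store, which is the property the abstract highlights. A real deployment would replace the recursive call with a request to the FaaS provider's invocation endpoint.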

Files

Original bundle
Name: eScience_Paper_FaaSr.pdf
Size: 1.28 MB
Format: Adobe Portable Document Format
Description: Accepted version
License bundle
Name: license.txt
Size: 1.5 KB
Format: Plain Text