Can LLMs Recommend More Responsible Prompts?

Santana, Vagner; Berger, Sara; Machado, Tiago; de Macedo, Maysa Malfiza; Sanctos, Cassia; Williams, Lemara; Wu, Zhaoqing

Can LLMs Recommend More Responsible Prompts?

dc.contributor.author	Santana, Vagner	en
dc.contributor.author	Berger, Sara	en
dc.contributor.author	Machado, Tiago	en
dc.contributor.author	de Macedo, Maysa Malfiza	en
dc.contributor.author	Sanctos, Cassia	en
dc.contributor.author	Williams, Lemara	en
dc.contributor.author	Wu, Zhaoqing	en
dc.date.accessioned	2025-04-04T12:12:33Z	en
dc.date.available	2025-04-04T12:12:33Z	en
dc.date.issued	2025-03-24	en
dc.date.updated	2025-04-01T07:48:10Z	en
dc.description.abstract	Human-Computer Interaction practitioners have been proposing best practices in user interface design for decades. However, generative Artificial Intelligence (GenAI) brings additional design considerations and currently lacks sufficient user guidance regarding affordances, inputs, and outputs. In this context, we developed a recommender system to promote responsible AI (RAI) practices while people prompt GenAI systems, by recommending addition of sentences based on social values and removal of harmful sentences. We detail a lightweight recommender system designed to be used in prompting-time and compare its recommendations to the ones provided by three base large language models (LLMs) and two LLMs fine-tuned for the task, i.e., recommending inclusion of sentences based on social values and removal of harmful sentences from a given prompt. Results indicate that our approach has the best F1-score balance in terms of recommendations for additions and removal of sentences to promote responsible prompts, while a fine-tuned model obtained the best F1-score for additions, and our approach obtained the best F1-score for removals of harmful sentences. In addition, fine-tuned models improved the objectiveness of responses by reducing the verbosity of generated content in 93% when compared to the content generated by base models. Presented findings contribute to RAI by showing the limits and bias of existing LLMs in terms of recommendations on how to create more responsible prompts and how open-source technologies can fill this gap in prompting-time.	en
dc.description.version	Published version	en
dc.format.mimetype	application/pdf	en
dc.identifier.doi	https://doi.org/10.1145/3708359.3712137	en
dc.identifier.uri	https://hdl.handle.net/10919/125137	en
dc.language.iso	en	en
dc.publisher	ACM	en
dc.rights	Creative Commons Attribution 4.0 International	en
dc.rights.holder	The author(s)	en
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/	en
dc.title	Can LLMs Recommend More Responsible Prompts?	en
dc.type	Article - Refereed	en
dc.type.dcmitype	Text	en