Evaluating Automated Summarization with Analyst Memories

Date

2025-01-13

Publisher

Laboratory for Analytic Sciences

Abstract

Automatic summarization remains a challenging area in natural language processing, particularly in the development of robust evaluation metrics. In this work we attempted to develop a task-specific summarization evaluation method by examining intelligence analysts' memories of documents and summaries. We ran a feasibility study to see whether analysts' memories of full texts, recalled one day later, are comparable to the content included in automatic summaries, as a way of measuring summary quality. We find that memories are comparable to summaries, but that methodological adjustments are likely necessary before that comparison can serve as an evaluation of varied summaries. We also compared analysts' memories of full texts versus summary texts to assess the impact summarization has on memory. We indeed see that different information is retained depending on which document version analysts read: more details were recalled from full texts, while summary texts were more often incorporated into broad statements about multiple documents. We conclude that there is merit in examining memory as a form of summary evaluation, both as a way of thinking about how to summarize and as a way of incorporating summaries into analyst workflows.

Keywords

automatic summarization evaluation, feasibility study, human-machine teaming, memory
