Monday, October 14, 2024

The scientific research process generally starts with an examination of the state of the art, which may involve a vast number of publications. Automatically summarizing scientific articles would help researchers in their investigation by speeding up the research process. The automatic summarization of scientific articles differs from the summarization of generic texts because of the articles' specific structure and their inclusion of citation sentences. Most of the valuable information in scientific articles is presented in tables, figures, and algorithm pseudocode. These elements, however, do not usually appear in a generic text. Therefore, several approaches that consider the particularities of scientific article structure have been proposed to enhance the quality of the generated summary, resulting in ad hoc automatic summarizers. This paper provides a comprehensive study of the state of the art in this field and discusses some future research directions. In particular, it reviews the approaches developed during the last decade, the corpora used, and their evaluation methods. It also discusses their limitations and points out some open problems. The conclusions of this study highlight the prevalence of extractive techniques for the automatic summarization of single monolingual articles, using a combination of statistical, natural language processing, and machine learning techniques. The absence of benchmark corpora and gold standard summaries for scientific articles remains the main issue for this task.