Relevant Content Selection Using Positional Language Models: An Experimental Analysis

Like many areas of Natural Language Processing, extractive summarization has succumbed to the general trend set by the success of deep learning and neural network approaches. However, the resources that such approaches require (computational, time, and data) are not always available. In this work

A Discourse-Informed Approach for Cost-Effective Extractive Summarization

This paper presents an empirical study that harnesses the benefits of Positional Language Models (PLMs) as the key to an effective methodology for understanding the gist of a discursive text via extractive summarization. We introduce an unsupervised, adaptive, and cost-efficient approach that integrates semantic information into the process. Texts are linguistically analyzed, and then semantic information—specifically
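To illustrate the general idea of position-aware extractive summarization, the following toy sketch scores sentences by term overlap with the document, weighted by a simple positional prior that favors early sentences. This is only an illustrative assumption, not the PLM method described in the paper; the function name and weighting scheme are hypothetical.

```python
# Toy sketch of position-aware extractive summarization (illustrative only;
# not the paper's PLM approach). Sentences are scored by term frequency
# weighted by a positional prior that favors early sentences.
from collections import Counter

def summarize(sentences, k=2, position_weight=0.5):
    """Pick k sentences by combined term-frequency and positional score."""
    doc_tf = Counter(w.lower() for s in sentences for w in s.split())
    n = len(sentences)
    scored = []
    for i, s in enumerate(sentences):
        words = [w.lower() for w in s.split()]
        tf_score = sum(doc_tf[w] for w in words) / max(len(words), 1)
        pos_score = 1.0 - i / n  # earlier sentences get a higher prior
        scored.append((tf_score + position_weight * pos_score, i))
    # Keep the top-k scores, then restore the original sentence order.
    top = sorted(sorted(scored, reverse=True)[:k], key=lambda t: t[1])
    return [sentences[i] for _, i in top]
```

The positional prior here is a crude linear decay; a real PLM would instead model the probability of relevant terms at each position in the document.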

Optimizing Data-Driven Models for Summarization as Parallel Tasks

This paper tackles a hard optimization problem in computational linguistics, specifically automatic multi-document text summarization, using grid computing. The main challenge of multi-document summarization is to extract the most relevant and unique information effectively and efficiently from a set of topic-related documents, constrained to a specified length. In the Big Data/Text era, where

Applying Natural Language Processing Techniques to Generate Open Data Web APIs Documentation

The globalisation of information access has resulted in the continuous growth of data available online, especially on open data portals. However, in current open data portals, data is difficult to understand and access. One reason for this difficulty is the lack of suitable mechanisms to extract and learn valuable information from existing open

The Impact of Rule-Based Text Generation on the Quality of Abstractive Summaries

In this paper we describe how an abstractive text summarization method improved the informativeness of automatic summaries by integrating syntactic text simplification, subject-verb-object concept frequency scoring and a set of rules that transform text into its semantic representation. We analyzed the impact of each component of our approach on the quality of generated summaries and

Overview of the eHealth Knowledge Discovery Challenge at IberLEF 2019

The eHealth Knowledge Discovery Challenge, hosted at IberLEF 2019, proposes an evaluation task for the automatic identification of key phrases and the semantic relations between them in health-related documents written in Spanish. This paper describes the challenge design, evaluation metrics, participants and main results. The most promising approaches are analyzed and the significant challenges are
