Keywords

natural language processing, artificial intelligence, computer science, low-resource NLP, multilingual NLP, table-to-text generation

Abstract

Generating faithful text descriptions from data tables is a significant challenge in Natural Language Processing (NLP), especially for the world’s many low-resource languages. This paper investigates whether Question-Answer (QA) blueprints—an intermediate planning step where a model first asks and answers questions about the data—can improve the factual accuracy of multilingual table-to-text generation. This novel approach is tested on the TaTA dataset, which includes several African languages, by finetuning models with and without these blueprints.

The results show a key distinction: while the QA blueprint method improves performance for English-only models, these gains disappear in the multilingual setting. This paper’s analysis reveals two primary reasons for this failure: 1) errors are introduced when machine-translating the blueprints from English into other languages, and 2) the models struggle to adhere to the plans they generate.

The main contribution of this research is a detailed diagnosis of why QA blueprints fail to transfer to the multilingual setting. By pinpointing these specific failure modes, the paper demonstrates that advanced planning techniques require more than simple translation, and it suggests that future work explore strategies such as constrained decoding to enforce blueprint adherence.
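As an illustration of the blueprint idea described above, the sketch below shows one way a question-answer plan might be prepended to a reference description to form a fine-tuning target for a sequence-to-sequence model. The function name, separator tokens, and example row are hypothetical and do not reflect the paper's exact format or the TaTA data.

```python
# Minimal sketch of constructing a QA-blueprint training target for
# seq2seq fine-tuning. Separators and field layout are illustrative only.

def build_blueprint_target(qa_pairs, description):
    """Prepend a question-answer plan to the reference description so the
    model learns to emit the plan before the final text."""
    blueprint = " ".join(f"Q: {q} A: {a}" for q, a in qa_pairs)
    return f"{blueprint} | {description}"


# Hypothetical example resembling a health-statistics table row.
qa_pairs = [
    ("Which country is reported?", "Nigeria"),
    ("What share of women received prenatal care?", "67%"),
]
description = "In Nigeria, 67% of women received prenatal care."

print(build_blueprint_target(qa_pairs, description))
```

At generation time the model would first produce the blueprint and then the description, which is what makes adherence to the plan measurable and, as the paper notes, enforceable in principle through techniques such as constrained decoding.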
