MultiNativQA | NativMMQA

Dataset resource

MultiNativQA: multilingual culturally aligned natural QA benchmark

MultiNativQA is a multilingual benchmark built with native-speaker queries and local context for more realistic evaluation of large language models. It supports both benchmarking and fine-tuning across languages with different resource levels.

Hugging Face dataset | ACL 2025 paper

Built with NativQA Framework

Overview

MultiNativQA is designed to evaluate large language models with queries that reflect how native speakers ask questions in their own languages and regions. Instead of relying on generic or synthetic prompts, the benchmark emphasizes cultural alignment, local knowledge, and realistic information needs across diverse linguistic settings.

Dataset at a glance

3,828 Hugging Face downloads (all-time)

CC BY-NC-SA 4.0 license

64k+ manually annotated QA pairs

7 languages represented

9 regions covered

18 seed topics for query collection

Coverage at a glance

MultiNativQA spans native-speaker queries collected across nine regions and covers both everyday and specialized topics to better stress-test multilingual model behavior in realistic settings.

Topics: Animal, Business, Cloth, Education, Events, Food & Drinks, General, Geography, Immigration, Language, Literature, Names & Persons, Plants, Religion, Sports & Games, Tradition, Travel, Weather

Why this benchmark matters

Native-speaker grounded

Queries come from native speakers, which makes the benchmark closer to real local information needs than template-heavy alternatives.

Region-aware evaluation

The dataset emphasizes cultural and regional variation that multilingual models often miss when evaluation sets are too generic.
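
One way to make regional variation visible is to report a score per region rather than a single pooled number. The following is a minimal sketch, not the benchmark's official evaluation: the record keys (region, prediction, reference) are hypothetical, the toy records are invented for illustration, and normalized exact match is used only as a simple stand-in for whatever metric an evaluation actually adopts.

```python
from collections import defaultdict


def normalize(text):
    # Case-fold and collapse whitespace so trivial formatting differences are ignored.
    return " ".join(text.casefold().split())


def per_region_accuracy(records):
    # records: iterable of dicts with hypothetical keys
    # "region", "prediction", and "reference".
    totals = defaultdict(int)
    correct = defaultdict(int)
    for record in records:
        region = record["region"]
        totals[region] += 1
        if normalize(record["prediction"]) == normalize(record["reference"]):
            correct[region] += 1
    return {region: correct[region] / totals[region] for region in totals}


def macro_average(scores):
    # Unweighted mean, so small regions count as much as large ones.
    return sum(scores.values()) / len(scores)


# Toy data for illustration only; not taken from the dataset.
toy_records = [
    {"region": "region_a", "prediction": "Answer one", "reference": "answer  one"},
    {"region": "region_a", "prediction": "wrong", "reference": "right"},
    {"region": "region_b", "prediction": "same", "reference": "same"},
]
scores = per_region_accuracy(toy_records)
print(scores)
print(macro_average(scores))
```

The macro average is chosen here so that a region with few examples is not drowned out by a larger one; a pooled average would hide exactly the regional gaps this kind of evaluation is meant to expose.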

Evaluation and tuning ready

The benchmark supports both model evaluation and downstream fine-tuning workflows built on culturally aligned QA data.
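
For the fine-tuning side, QA pairs usually need to be converted into prompt and completion records first. The sketch below shows one possible conversion to JSON Lines. The field names (question, answer), the prompt template, and the toy examples are assumptions made for illustration and are not taken from the dataset's actual schema.

```python
import json


def to_sft_record(example):
    # Hypothetical input keys: "question" and "answer".
    prompt = f"Question: {example['question']}\nAnswer:"
    return {"prompt": prompt, "completion": " " + example["answer"]}


def to_jsonl(examples):
    # ensure_ascii=False keeps non-Latin scripts readable instead of \u escapes.
    return "\n".join(
        json.dumps(to_sft_record(example), ensure_ascii=False)
        for example in examples
    )


# Toy examples for illustration only; not taken from the dataset.
toy_examples = [
    {"question": "toy question one?", "answer": "toy answer one"},
    {"question": "প্রশ্ন?", "answer": "উত্তর"},
]
jsonl_text = to_jsonl(toy_examples)
print(jsonl_text)
```

Each record stays on a single line because json.dumps escapes the newline inside the prompt, which keeps the output valid JSON Lines.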

Citation

@inproceedings{hasan-etal-2025-nativqa,
  title = "{NativQA:} Multilingual Culturally-Aligned Natural Query for {LLM}s",
  author = "Hasan, Md. Arid and Hasanain, Maram and Ahmad, Fatema and Laskar, Sahinur Rahman and Upadhyay, Sunaya and Sukhadia, Vrunda N and Kutlu, Mucahid and Chowdhury, Shammur Absar and Alam, Firoj",
  booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
  year = "2025",
  address = "Vienna, Austria",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/2025.findings-acl.770/",
  doi = "10.18653/v1/2025.findings-acl.770",
  pages = "14886--14909"
}

Metrics are fetched from public APIs when available.