登录 注册

GS-QA: A Benchmark for Geospatial Question Answering

🔗 访问原文
🔗 Access Paper

📝 摘要
Abstract

Recent advances in Large Language Models (LLMs) have led to dramatic improvements in question answering (QA). To address the challenge of evaluating QA systems, standardized benchmarks have been introduced. This work focuses on the problem of geospatial QA, where a large collection of geospatial data is available in the form of a spatial database or other forms. Existing work on geospatial QA benchmarks has various limitations, including a small number of questions, limited spatial predicates, narrow output types, and no multi-source reasoning. We present GS-QA, an extensible geospatial QA benchmark with 2,800 question-answer pairs across 28 templates on top of OpenStreetMap and Wikipedia data, covering a wide range of spatial objects, predicates (including directional and towards filtering), and answer types (entity names, locations, distances, directions, counts, and aggregated areas/lengths). A key feature of GS-QA is that some questions require combining information from multiple sources, e.g., geospatial information from OSM and factual information from Wikipedia. GS-QA includes a comprehensive evaluation methodology that combines text-based QA measures with geospatial-specific measures such as distance error and angular error. We implemented nine LLM-based geospatial QA baselines using three LLMs (GPT-4o, Claude Sonnet 4.6, and Ministral-3) with combinations of direct prompting, retrieval-augmented generation, and text-to-SQL. Our results show that existing solutions perform reasonably well on simple spatial predicates with entity name outputs, but accuracy degrades significantly for questions involving complex spatial predicates, numeric output types, and multi-source reasoning, demonstrating that geospatial QA remains a challenging open problem warranting further research.

📊 文章统计
Article Statistics

基础数据
Basic Stats

201 浏览
Views
0 下载
Downloads
25 引用
Citations

引用趋势
Citation Trend

阅读国家分布
Country Distribution

阅读机构分布
Institution Distribution

月度浏览趋势
Monthly Views

相关关键词
Related Keywords

影响因子分析
Impact Analysis

6.40 综合评分
Overall Score
引用影响力
Citation Impact
浏览热度
View Popularity
下载频次
Download Frequency

📄 相关文章
Related Articles

海洋智能分析Ocean AI Analysis

正在分析中,请稍候…Analyzing, please wait…
海洋智能体 🌊
海洋智能体
AI科研助手 · 2270篇文献
我看到你正在阅读一篇文献,需要我帮你解读摘要、推荐相关论文,或者分析研究方法论吗?