Ocean Research - International Top-Journal Literature Statistics Platform

When No Benchmark Exists: Validating Comparative LLM Safety Scoring Without Ground-Truth Labels

Authors: Sushant Gautam | Finn Schwall | Annika Willoch Ols...
Journal: arXiv
Year: 2026
Category: NLP

👁 190 📚 29

EMO: Pretraining Mixture of Experts for Emergent Modularity

Authors: Ryan Wang | Akshita Bhagia | Sewon Min
Journal: arXiv
Year: 2026
Category: NLP

👁 204 📚 23

Implicit Representations of Grammaticality in Language Models

Authors: Yingshan Susan Wang | Linlu Qiu | Zhaofeng Wu | Ro...
Journal: arXiv
Year: 2026
Category: NLP

👁 118 📚 3

Safety and accuracy follow different scaling laws in clinical large language models

Authors: Sebastian Wind | Tri-Thien Nguyen | Jeta Sopa | Ma...
Journal: arXiv
Year: 2026
Category: NLP

👁 88 📚 28

FlexSQL: Flexible Exploration and Execution Make Better Text-to-SQL Agents

Authors: Quang Hieu Pham | Yang He | Ping Nie | Canwen Xu |...
Journal: arXiv
Year: 2026
Category: NLP

👁 213 📚 20

When LLMs Stop Following Steps: A Diagnostic Study of Procedural Execution in Language Models

Authors: Sailesh Panda | Pritam Kadasi | Abhishek Upperwal ...
Journal: arXiv
Year: 2026
Category: NLP

👁 148 📚 26

On the Proper Treatment of Units in Surprisal Theory

Authors: Samuel Kiegeland | Vésteinn Snæbjarnarson | Tim Vi...
Journal: arXiv
Year: 2026
Category: NLP

👁 162 📚 15

Exploration Hacking: Can LLMs Learn to Resist RL Training?

Authors: Eyon Jang | Damon Falck | Joschka Braun | Nathalie...
Journal: arXiv
Year: 2026
Category: NLP

👁 106 📚 7

Select to Think: Unlocking SLM Potential with Local Sufficiency

Authors: Wenxuan Ye | Yangyang Zhang | Xueli An | Georg Car...
Journal: arXiv
Year: 2026
Category: NLP

👁 75 📚 9

DV-World: Benchmarking Data Visualization Agents in Real-World Scenarios

Authors: Jinxiang Meng | Shaoping Huang | Fangyu Lei | Jing...
Journal: arXiv
Year: 2026
Category: NLP

👁 184 📚 23

Sentiment and Emotion Classification of Indonesian E-Commerce Reviews via Multi-Task BiLSTM and Auto...

Authors: Hermawan Manurung | Ibrahim Al-Kahfi | Ahmad Rizqi...
Journal: arXiv
Year: 2026
Category: NLP

👁 115 📚 24

How Do AI Agents Spend Your Money? Analyzing and Predicting Token Consumption in Agentic Coding Task...

Authors: Longju Bai | Zhemin Huang | Xingyao Wang | Jiao Su...
Journal: arXiv
Year: 2026
Category: NLP

👁 125 📚 3

When Prompts Override Vision: Prompt-Induced Hallucinations in LVLMs

Authors: Pegah Khayatan | Jayneel Parekh | Arnaud Dapogny |...
Journal: arXiv
Year: 2026
Category: NLP

👁 211 📚 18

MathDuels: Evaluating LLMs as Problem Posers and Solvers

Authors: Zhiqiu Xu | Shibo Jin | Shreya Arya | Mayur Naik
Journal: arXiv
Year: 2026
Category: NLP

👁 118 📚 25

Evaluation of Automatic Speech Recognition Using Generative Large Language Models

Authors: Thibault Bañeras-Roux | Shashi Kumar | Driss Khali...
Journal: arXiv
Year: 2026
Category: NLP

👁 61 📚 12

SpeechParaling-Bench: A Comprehensive Benchmark for Paralinguistic-Aware Speech Generation

Authors: Ruohan Liu | Shukang Yin | Tao Wang | Dong Zhang |...
Journal: arXiv
Year: 2026
Category: NLP

👁 74 📚 20

Discovering a Shared Logical Subspace: Steering LLM Logical Reasoning via Alignment of Natural-Langu...

Authors: Feihao Fang | My T. Thai | Yuanyuan Lei
Journal: arXiv
Year: 2026
Category: NLP

👁 134 📚 4

Sessa: Selective State Space Attention

Authors: Liubomyr Horbatko
Journal: arXiv
Year: 2026
Category: NLP

👁 171 📚 29

Learning to Reason with Insight for Informal Theorem Proving

Authors: Yunhe Li | Hao Shi | Bowen Deng | Wei Wang | Mengz...
Journal: arXiv
Year: 2026
Category: NLP

👁 194 📚 30

CoopEval: Benchmarking Cooperation-Sustaining Mechanisms and LLM Agents in Social Dilemmas

Authors: Emanuel Tewolde | Xiao Zhang | David Guzman Piedra...
Journal: arXiv
Year: 2026
Category: NLP

👁 139 📚 7

📚 Literature Library