SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments
👁 102 📚 1
SceneCritic: A Symbolic Evaluator for 3D Indoor Scene Synthesis
👁 119 📚 12
Detecting Safety Violations Across Many Agent Traces
👁 17 📚 5
Case-Grounded Evidence Verification: A Framework for Constructing Evidence-Sensitive Supervision
👁 116 📚 27
OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks
👁 69 📚 16
AVGen-Bench: A Task-Driven Benchmark for Multi-Granular Evaluation of Text-to-Audio-Video Generation
👁 194 📚 9
Seeing but Not Thinking: Routing Distraction in Multimodal Mixture-of-Experts
👁 204 📚 12
Personalized RewardBench: Evaluating Reward Models with Human Aligned Personalization
👁 88 📚 14
Paper Circle: An Open-source Multi-agent Research Discovery and Analysis Framework
👁 184 📚 18
Beyond the Final Actor: Modeling the Dual Roles of Creator and Editor for Fine-Grained LLM-Generated...
👁 72 📚 2
BAS: A Decision-Theoretic Approach to Evaluating Large Language Model Confidence
👁 98 📚 4
No Single Best Model for Diversity: Learning a Router for Sample Diversity
👁 102 📚 0
Grounded Token Initialization for New Vocabulary in LMs for Generative Recommendation
👁 190 📚 4
Universal YOCO for Efficient Depth Scaling
👁 148 📚 22
Reward-Based Online LLM Routing via NeuralUCB
👁 70 📚 27
Natural-Language Agent Harnesses
👁 123 📚 8
Training the Knowledge Base through Evidence Distillation and Write-Back Enrichment
👁 193 📚 13
Comparing Developer and LLM Biases in Code Evaluation
👁 114 📚 24
MedObvious: Exposing the Medical Moravec's Paradox in VLMs via Clinical Triage
👁 124 📚 24
ThinkJEPA: Empowering Latent World Models with Large Vision-Language Reasoning Model
👁 97 📚 13