TimeProVe: Propose, then Verify for Efficient Long Video Temporal Reasoning in Activities of Daily L...
👁 140 📚 10
JanusMesh: Fast and Zero-Shot 3D Visual Illusion Generation via Cross-Space Denoising
👁 195 📚 30
Beyond the Current Observation: Evaluating Multimodal Large Language Models in Controllable Non-Mark...
👁 155 📚 21
Future Dynamic 3D Reconstruction: A 3D World Model with Disentangled Ego-Motion
👁 61 📚 9
Context-Aware RL for Agentic and Multimodal LLMs
👁 116 📚 17
OmniVideo-100K: A Dataset for Audio-Visual Reasoning through Structured Scripts and Evidence Chains
👁 193 📚 3
Modality Forcing for Scalable Spatial Generation
👁 76 📚 19
InterleaveThinker: Reinforcing Agentic Interleaved Generation
👁 146 📚 14
Reroute, Don't Remove: Recoverable Visual Token Routing for Vision-Language Models
👁 40 📚 29
ARM: An AutoRegressive Large Multimodal Model with Unified Discrete Representations
👁 39 📚 15
Latent Spatial Memory for Video World Models
👁 32 📚 5
UniSHARP: Universal Sharp Monocular View Synthesis
👁 103 📚 2
Thinking with Imagination: Agentic Visual Spatial Reasoning with World Simulators
👁 196 📚 27
Complexity-Balanced Diffusion Splitting
👁 194 📚 14
PAR3D: A Unified 3D-MLLM with Part-Aware Representation for Scene Understanding
👁 145 📚 12
Exploring Easy Boosts for Lidar Semantic Scene Completion
👁 70 📚 3
Representation Forcing for Bottleneck-Free Unified Multimodal Models
👁 190 📚 5
AdaState: Self-Evolving Anchors for Streaming Video Generation
👁 171 📚 3
VideoMLA: Low-Rank Latent KV Cache for Minute-Scale Autoregressive Video Diffusion
👁 61 📚 22
GMOS: Grounding Moving Object Segmentation in 3D Space and Time
👁 164 📚 16