Marine Research - International Top-Journal Literature Statistics Platform

Lift4D: Harmonizing Single-View 3D Estimation for 4D Reconstruction In-the-Wild

Authors: Yehonathan Litman | Xiaoxuan Ma | Manan Shah | Nic...
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 195 📚 20

TimeProVe: Propose, then Verify for Efficient Long Video Temporal Reasoning in Activities of Daily L...

Authors: Arkaprava Sinha | Dominick Reilly | Siddharth Kris...
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 140 📚 10

JanusMesh: Fast and Zero-Shot 3D Visual Illusion Generation via Cross-Space Denoising

Authors: Siang-Ling Zhang | Huai-Hsun Cheng | Tsung-Ju Yang...
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 195 📚 30

Beyond the Current Observation: Evaluating Multimodal Large Language Models in Controllable Non-Mark...

Authors: Shengyuan Ding | Xilin Wei | Xinyu Fang | Haodong ...
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 155 📚 21

Future Dynamic 3D Reconstruction: A 3D World Model with Disentangled Ego-Motion

Authors: Nils Morbitzer | Jonathan Evers | Artem Savkin | T...
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 61 📚 9

Context-Aware RL for Agentic and Multimodal LLMs

Authors: Peiyang Xu | Bangzheng Li | Sijia Liu | Karthik R....
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 116 📚 17

OmniVideo-100K: A Dataset for Audio-Visual Reasoning through Structured Scripts and Evidence Chains

Authors: Xinyue Cai | Chaoyou Fu | Yi-Fan Zhang | Ran He | ...
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 193 📚 3

Modality Forcing for Scalable Spatial Generation

Authors: Bardienus Pieter Duisterhof | Deva Ramanan | Jeffr...
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 76 📚 19

InterleaveThinker: Reinforcing Agentic Interleaved Generation

Authors: Dian Zheng | Harry Lee | Manyuan Zhang | Kaituo Fe...
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 146 📚 14

Reroute, Don't Remove: Recoverable Visual Token Routing for Vision-Language Models

Authors: Cheng-Yu Yang | Shao-Yuan Lo | Yu-Lun Liu
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 40 📚 29

ARM: An AutoRegressive Large Multimodal Model with Unified Discrete Representations

Authors: Junke Wang | Xiao Wang | Jiacheng Pan | Xuefeng Hu...
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 39 📚 15

Latent Spatial Memory for Video World Models

Authors: Weijie Wang | Haoyu Zhao | Yifan Yang | Feng Chen ...
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 32 📚 5

UniSHARP: Universal Sharp Monocular View Synthesis

Authors: Meixi Song | Dizhe Zhang | Hao Ren | Ruiyang Zhang...
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 103 📚 2

Thinking with Imagination: Agentic Visual Spatial Reasoning with World Simulators

Authors: Chenming Zhu | Jingli Lin | Yilin Long | Peizhou C...
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 196 📚 27

Complexity-Balanced Diffusion Splitting

Authors: Noam Issachar | Dani Lischinski | Raanan Fattal
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 194 📚 14

PAR3D: A Unified 3D-MLLM with Part-Aware Representation for Scene Understanding

Authors: Shaohui Dai | Yansong Qu | You Shen | Shengchuan Z...
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 145 📚 12

Exploring Easy Boosts for Lidar Semantic Scene Completion

Authors: Tetiana Martyniuk | Jonathan Seele | Alexandre Bou...
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 70 📚 3

Representation Forcing for Bottleneck-Free Unified Multimodal Models

Authors: Yuqing Wang | Zhijie Lin | Ceyuan Yang | Yang Zhao...
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 190 📚 5

AdaState: Self-Evolving Anchors for Streaming Video Generation

Authors: Yusuf Dalva | Pinar Yanardag
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 171 📚 3

VideoMLA: Low-Rank Latent KV Cache for Minute-Scale Autoregressive Video Diffusion

Authors: Hidir Yesiltepe | Jiazhen Hu | Tuna Han Salih Mera...
Journal: arXiv
Year: 2026
Category: Computer Vision

👁 61 📚 22

📚 Literature Library