One Token per Highly Selective Frame: Towards Extreme Compression for Long Video Understanding
Authors:
Zheyu Zhang | Ziqi Pang | Shixing Chen | Xiang Hao...
Journal:
arXiv
Year:
2026
Category:
Computer Vision