TurboESM: Ultra-Efficient 3-Bit KV Cache Quantization for Protein Language Models with Orthogonal Rotation and QJL Correction
Authors: No author information available
Journal: No journal information available
Year: -
Category: -
Country: -
📝 Abstract
The rapid scaling of Protein Language Models (PLMs) has unlocked unprecedented accuracy in protein structure prediction and design, but the quadratic memory growth of the Key-Value (KV) cache during inference remains a prohibitive barrier for single-GPU deployment and high-throughput generation. While 8-bit quantization is now standard, 3-bit quantization remains elusive due to severe numerical outliers in activations. This paper presents TurboESM, an adaptation of Google's TurboQuant to the PLM
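The abstract names the core idea behind low-bit KV cache quantization with orthogonal rotation: activation outliers inflate the quantization scale, and rotating the key/value vectors by a random orthogonal matrix spreads the outlier energy across channels before rounding. The paper's exact TurboQuant/QJL procedure is not reproduced here; the following is a minimal NumPy sketch under that general assumption, with hypothetical helper names (`random_orthogonal`, `quantize_3bit`):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_orthogonal(d, rng):
    # QR decomposition of a Gaussian matrix gives a random orthogonal matrix;
    # the sign correction makes the distribution uniform (Haar).
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    return q * np.sign(np.diag(r))

def quantize_3bit(x):
    # Symmetric per-row 3-bit quantization: 8 integer levels in [-4, 3].
    scale = np.abs(x).max(axis=-1, keepdims=True) / 3.5
    q = np.clip(np.round(x / scale), -4, 3).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

# Toy "key" activations with one large outlier channel, mimicking the
# severe numerical outliers the abstract describes.
keys = rng.standard_normal((16, 64)).astype(np.float32)
keys[:, 7] *= 20.0  # outlier channel dominates the per-row scale

R = random_orthogonal(64, rng)
rotated = keys @ R  # rotation spreads the outlier energy across channels

q_plain, s_plain = quantize_3bit(keys)
q_rot, s_rot = quantize_3bit(rotated)

err_plain = np.abs(dequantize(q_plain, s_plain) - keys).mean()
# Rotate back after dequantizing so both errors are measured in the
# original basis (R is orthogonal, so R @ R.T = I).
err_rot = np.abs(dequantize(q_rot, s_rot) @ R.T - keys).mean()
print(err_rot < err_plain)
```

With the outlier channel present, the rotated representation quantizes with a noticeably smaller mean absolute error, which is the motivation for rotation-based schemes; the QJL correction step mentioned in the title is a separate refinement not modeled in this sketch.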
📊 Article Statistics
Views: 9
Downloads: 0
Citations: 0