Snowflake announces SwiftKV, a technique that optimizes how AI models process preset prompts and is said to cut inference time by 50%

ITHome (IT之家)

Jan 17, 2025

ITHome, January 17: Cloud data company Snowflake has announced an AI model tuning technique called "SwiftKV" and has open-sourced three Llama 3.1 models tuned with SwiftKV on Hugging Face.

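ITHome has learned that the core of SwiftKV is optimizing how a model processes its prompt. The researchers point out that the most compute-intensive stage for a large model is usually processing the prompt supplied to it, and that many enterprises configure extremely long custom prompts for their models. On average, the prompt is reportedly "roughly 10 times" the length of the generated output.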
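As a rough illustration of the 10-to-1 ratio quoted above, the sketch below computes what share of a request's tokens belong to the prompt. The token counts (2,000 and 200) and the function name are assumptions chosen for the example, not figures from Snowflake, and token share is only a loose proxy for compute cost.

```python
def prompt_token_share(prompt_tokens: int, output_tokens: int) -> float:
    """Return the fraction of all tokens in a request that belong to the prompt."""
    total = prompt_tokens + output_tokens
    if total <= 0:
        raise ValueError("a request must contain at least one token")
    return prompt_tokens / total


# Hypothetical request whose prompt is 10 times as long as its output.
share = prompt_token_share(prompt_tokens=2000, output_tokens=200)
print(f"{share:.1%}")  # prints 90.9%
```

Under that assumed ratio, about nine in ten tokens are prompt tokens, which is why speeding up prompt processing can matter more than speeding up generation.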

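According to Snowflake, SwiftKV is optimized specifically for processing such preset prompts. The company says the technique goes beyond conventional key-value (KV) cache compression and also introduces model rewiring and knowledge-preserving self-distillation into model inference, raising throughput while lowering latency and compute cost. Snowflake claims it can cut a model's inference time by 50%.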
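For readers unfamiliar with the term, the sketch below shows what a generic KV cache does in a transformer: keys and values for the prompt are computed once during a "prefill" pass and then reused at every generation step. This is not Snowflake's SwiftKV implementation; the class and function names are hypothetical, and the toy uses the input vectors themselves as keys and values instead of learned projections.

```python
import math


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    norm = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / norm for i in range(dim)]


class KVCache:
    """Holds the keys and values of every token processed so far."""

    def __init__(self):
        self.keys = []
        self.values = []

    def append(self, key, value):
        self.keys.append(key)
        self.values.append(value)


def prefill(cache, prompt_vectors):
    """Process the whole prompt once, storing a key and value per token."""
    for vec in prompt_vectors:
        cache.append(vec, vec)  # assumption: identity projections, for brevity


def decode_step(cache, token_vector):
    """Generate one step: add the new token, then attend over the cached entries."""
    cache.append(token_vector, token_vector)
    return attend(token_vector, cache.keys, cache.values)


cache = KVCache()
prefill(cache, [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
out = decode_step(cache, [0.5, 0.5])
print(len(cache.keys), out)
```

Because the prefill loop runs over every prompt token, its cost grows with prompt length, which is the stage the article says SwiftKV targets.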

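Experimental results show that after SwiftKV is applied to the 8-billion- and 70-billion-parameter Llama 3.1 models, their overall throughput can increase by a factor of two, and the optimized models also perform well on tasks such as code completion and text summarization.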

