微软开源创新框架:可将DeepSeek,变成AI Agent

美港电讯
17 Feb

【微软开源创新框架:可将DeepSeek,变成AI Agent】金十数据2月17日讯,微软在官网发布了视觉Agent解析框架OmniParser最新版本V2.0,可将DeepSeek-R1、GPT-4o、Qwen-2.5VL等模型,变成可在计算机使用的AI Agent。与V1版本相比,V2在检测较小的可交互UI元素时准确率更高、推理速度更快,延迟降低了60%。在高分辨率Agent基准测试ScreenSpot Pro中,V2+GPT-4o的准确率达到了惊人的39.6%,而GPT-4o原始准确率只有0.8%,整体提升非常大。除了V2,微软还开源了omnitool,这是一个基于Docker的 Windows 系统,涵盖屏幕理解、定位、动作规划和执行等功能,也是将大模型变成Agent的关键工具。

Disclaimer: Investing carries risk. This is not financial advice. The above content should not be regarded as an offer, recommendation, or solicitation on acquiring or disposing of any financial products, any associated discussions, comments, or posts by author or other users should not be considered as such either. It is solely for general information purpose only, which does not consider your own investment objectives, financial situations or needs. TTM assumes no responsibility or warranty for the accuracy and completeness of the information, investors should do their own research and may seek professional advice before investing.

Most Discussed

  1. 1
     
     
     
     
  2. 2
     
     
     
     
  3. 3
     
     
     
     
  4. 4
     
     
     
     
  5. 5
     
     
     
     
  6. 6
     
     
     
     
  7. 7
     
     
     
     
  8. 8
     
     
     
     
  9. 9
     
     
     
     
  10. 10