DeepSeek's Goal Is AGI, Not Shorting Nvidia

信息平权
24 Feb

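What everyone really wants to know is: what comes next? Judging by this release of FlashMLA, the likely direction is code implementations of previously published papers. Nothing complicated; the core idea is roughly: "You can't learn it from the papers? OK, we'll just throw the code at you..." V3/R1 contain a lot of tricks: MTP (multi-token prediction), mixed-precision FP8 training, DualPipe (dual-pipeline training), long CoT (long chain of thought), and a number of optimizations to low-level communication operators. These original "manuscripts," or the code behind them, may be among the content of the next four days. On second thought, this...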


