2024-03-08.md

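1. New work from Yuandong Tian and colleagues: breaking the memory bottleneck to pre-train a 7B model on a single 4090

The study is the first to demonstrate that pre-training a 7B model is feasible on a consumer GPU with 24 GB of memory (e.g., NVIDIA RTX 4090), without model parallelism, checkpointing, or offloading strategies.

Paper: https://arxiv.org/abs/2403.03507

Paper title: GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection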

2. Hugging Face launches an open-source robotics project, led by a former Tesla scientist

https://venturebeat.com/ai/hugging-face-is-launching-an-open-source-robotics-project-led-by-former-tesla-scientist/

3. A prompt for validating business ideas with Claude 3

https://twitter.com/mattshumer_/status/1765822278351143113?s=20

4. A summary of papers and reflections on LLM-based multi-agent systems

https://github.com/taichengguo/LLM_MultiAgents_Survey_Papers