DeepSeek AI’s Post

Day 4 of #OpenSourceWeek: Optimized Parallelism Strategies

✅ DualPipe - a bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
🔗 https://lnkd.in/gurXyTVe

✅ EPLB - an expert-parallel load balancer for V3/R1.
🔗 https://lnkd.in/giPt92JG

📊 Analyze computation-communication overlap in V3/R1.
🔗 https://lnkd.in/gubSQfMP

Chiara Maria Cervetta

Florist | Volunteer Community Manager at Say, Pi | Google Local Guides Guiding Star on Google Maps in the Community Builder category

1mo

That's so very amazing, DeepSeek AI team! 🤩 🐳 🌟 ✨


3 repos in 1 day! That's cool!

Junwei Li

MLE@Apple | LLM Sys | NeurIPS2024 | xMicrosoft

1mo

awesome!

Mohammed Tanvir

AI & Cloud Innovator | Azure AI Engineer Associate | Expert in AI/ML Engineering, DevOps, Kubernetes, CI/CD, Terraform | Certified Cloud Architect (Azure, AWS, OCI) | Driving Transformation with AI & Cloud Technologies.

1mo

DualPipe’s bidirectional pipeline parallelism and EPLB’s expert-parallel load balancing are game-changers for V3/R1 training, optimizing computation-communication overlap like never before. These advancements not only boost efficiency but also push the boundaries of scalable AI. Kudos to the #DeepSeek team for driving progress and sharing these insights with the community. Let’s keep building the future.

Rais Kazi

Exploring true Value of AI | Mentor

1mo

Sounds like a smart technique to make the algorithm "compute-bound", giving better utilization of GPU FLOPS and more efficient training.

Kiran Dugana

Developing AI medha for India

1mo

Interesting


Major is AI


Whoever is fastest to change the algorithms in AI to fractal wave non-linear ones will get super AI.


Is R2 coming soon?
