You can now run DeepSeek-V3-0324 locally using our 2.71-bit Dynamic GGUF! We shrank 720GB to 231GB (-70%) by selectively quantizing layers. The 2.71-bit version passes many code tests, producing nearly identical results to the full 8-bit model. Guide + examples: https://lnkd.in/g8_D9-fz Model upload: https://lnkd.in/gUfp5-xm
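For reference, a minimal sketch of grabbing just the 2.71-bit shards from Hugging Face before running them with llama.cpp (the repo id and the "UD-Q2_K_XL" filename pattern below are assumptions; check the model upload link for the exact names):

from huggingface_hub import snapshot_download

# Download only the 2.71-bit dynamic quant shards (repo id and filename pattern are
# assumptions; verify them against the model upload linked above).
snapshot_download(
    repo_id = "unsloth/DeepSeek-V3-0324-GGUF",
    local_dir = "DeepSeek-V3-0324-GGUF",
    allow_patterns = ["*UD-Q2_K_XL*"],
)
# Point llama.cpp (or Ollama) at the first downloaded shard to run the model.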
Unsloth AI
Technology, Information and Internet
San Francisco, California · 11,799 followers
Making AI accessible for everyone! 🦥
About us
- Website: https://unsloth.ai
- Industry: Technology, Information and Internet
- Company size: 2-10 employees
- Headquarters: San Francisco, California
- Type: Privately Held
- Founded: 2023
- Specialties: artificial intelligence, ai, llms, language models, and finetuning
Locations
- Primary: San Francisco, California 94107, US
Updates
Unsloth AI reposted this
We teamed up with Hugging Face to release a free GRPO notebook that fine-tunes Gemma 3 into a powerful reasoning model! Using Unsloth AI, OpenAI’s math dataset and custom reward functions, we fine-tune Google’s Gemma 3 (1B) to generate chain-of-thought reasoning. Free Colab Notebook: https://lnkd.in/e94SKJz4 Summary of what you'll learn: • Implement chain-of-thought reasoning in Google's Gemma 3 (1B) using 16-bit LoRA • Make tiny LLMs benefit from GRPO • Understand reward functions • Prepare your data + evaluate your LLM Join HF's Course: https://lnkd.in/e_PhX4tc Thank you Ben Burtenshaw for being patient and working with us on this collab! 🤗
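For a feel of what the custom reward functions look like, here is a rough sketch in the style of TRL's GRPOTrainer reward API (the tags and scores are illustrative assumptions, not the notebook's exact rewards):

import re

# Illustrative GRPO format reward: completions that wrap their working in
# <reasoning>...</reasoning> and the final answer in <answer>...</answer> score 1.0.
# (Tags and weights are assumptions; the notebook's actual reward functions may differ.)
def format_reward(completions, **kwargs):
    # TRL passes chat-style completions as lists of messages; plain datasets pass strings.
    texts = [c[0]["content"] if isinstance(c, list) else c for c in completions]
    pattern = r"<reasoning>.*?</reasoning>\s*<answer>.*?</answer>"
    return [1.0 if re.search(pattern, t, re.DOTALL) else 0.0 for t in texts]

# A list of such functions is passed to GRPOTrainer via its reward_funcs argument.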
Unsloth AI reposted this
The unit we’re all waiting for is here! Unsloth AI + Hugging Face on GRPO in the reasoning course. 🔗 https://lnkd.in/enr3adQ5 In this unit, you’ll build on the earlier units by implementing GRPO in Unsloth, and this time we’re also levelling things up: - run on limited hardware with Unsloth optimizations - expand GRPO reward functions to format and beyond - explore a wider range of model sizes, up to 7B This should help way more students without serious hardware. Can’t wait to hear how it goes. Follow the org to join in: https://lnkd.in/enr3adQ5
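One way to go beyond format rewards, sketched under the assumption that each dataset row carries a ground-truth "answer" column (the column name and tags here are hypothetical):

import re

# Hypothetical correctness reward: compare the extracted answer against a
# ground-truth "answer" column that TRL forwards from the dataset as a kwarg.
def correctness_reward(completions, answer, **kwargs):
    texts = [c[0]["content"] if isinstance(c, list) else c for c in completions]
    rewards = []
    for text, gold in zip(texts, answer):
        match = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
        predicted = match.group(1).strip() if match else ""
        rewards.append(2.0 if predicted == str(gold).strip() else 0.0)
    return rewards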
Unsloth now works on Windows! 🦥 Fine-tune LLMs locally on Windows without Linux or WSL. Just install prerequisites & run our pip command. Tutorial: https://lnkd.in/gWC4AcMV
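Once the prerequisites and the pip install are done, a quick sanity check like this (an illustrative snippet, not part of the tutorial) confirms Unsloth imports cleanly and the GPU is visible on Windows:

import unsloth  # should import without errors once the prerequisites are installed
import torch

# Illustrative post-install check; see the tutorial for the actual setup steps.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))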
Tutorial: Train your own Reasoning LLM for free! Transform Llama 3.1 (8B) to have chain-of-thought using DeepSeek's GRPO algorithm. Unsloth makes GRPO use 90% less VRAM: https://docs.unsloth.ai/ You'll learn about: • Reward Functions + dataset prep • GRPO Basics + tips & tricks • Training on free Colab GPUs • Running + evaluating + saving your model Tutorial Link: https://lnkd.in/gxYGrFhd
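As a rough sketch of the dataset-prep step, assuming the GSM8K math dataset and a simple reasoning/answer prompt format (the tutorial's exact template may differ):

from datasets import load_dataset

# Illustrative GRPO dataset prep: each example becomes a chat prompt that asks for
# chain-of-thought plus a final answer (the format is an assumption, not the tutorial's exact one).
SYSTEM_PROMPT = "Think step by step inside <reasoning> tags, then give the final answer inside <answer> tags."

def to_grpo_example(example):
    return {
        "prompt": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": example["question"]},
        ],
        # GSM8K stores the final numeric answer after "#### "
        "answer": example["answer"].split("####")[-1].strip(),
    }

dataset = load_dataset("openai/gsm8k", "main", split="train").map(to_grpo_example)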
Today, we’re launching new algorithms that enable 10x longer context lengths & 90% less VRAM for training reasoning models (GRPO). Using Unsloth, you can now train your own reasoning model with just 5GB VRAM for Qwen2.5 (1.5B), with no accuracy loss. Blog: https://lnkd.in/gnvEjxMm Free Colab Notebook for Llama 3.1 (8B) GRPO: https://lnkd.in/g7deg5Uw For our benchmarks, a standard GRPO QLoRA setup (TRL + FA2) for Llama 3.1 (8B) at 20K context required 510.8GB VRAM. Unsloth’s GRPO algorithms reduce this to just 54.3GB. The 5GB VRAM requirement for Qwen2.5 (1.5B) is down from 7GB in our previous GRPO release two weeks ago!
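The context-length and memory trade-offs are controlled through TRL's GRPOConfig when training with Unsloth; a hedged sketch with placeholder values (not our benchmark settings):

from trl import GRPOConfig

# Illustrative GRPO training arguments -- values are placeholders, not the benchmark
# configuration; longer completion lengths are what the new memory-efficient
# algorithms make practical on small GPUs.
training_args = GRPOConfig(
    learning_rate = 5e-6,
    per_device_train_batch_size = 1,
    gradient_accumulation_steps = 4,
    num_generations = 8,             # completions sampled per prompt by GRPO
    max_prompt_length = 1024,
    max_completion_length = 8192,    # room for long chain-of-thought
    max_steps = 250,
    output_dir = "outputs",
)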
You can now reproduce DeepSeek-R1's reasoning on your own local device! Introducing reasoning in Unsloth. You'll just need 7GB VRAM to experience your own "Aha" moment 100% locally or free on Colab. Unsloth makes GRPO RL use 80% less memory. With 15GB VRAM, you can convert Llama 3.1 (8B), Phi-4 (14B), Mistral (7B), or any model up to 15B parameters into reasoning models. Guide + Blog: https://lnkd.in/gdzMDsYF
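A minimal sketch of the Unsloth side of that setup, loading a 4-bit model and attaching LoRA adapters before GRPO training (argument values are placeholders; the guide above has the exact configuration):

from unsloth import FastLanguageModel

# Illustrative setup for GRPO on a ~15GB GPU; values are placeholders, see the guide.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "meta-llama/Llama-3.1-8B-Instruct",  # placeholder model id
    max_seq_length = 1024,
    load_in_4bit = True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    lora_alpha = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)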
Introducing 1.58-bit DeepSeek-R1 GGUFs! 🐋 R1 can now run in 1.58-bit while being fully functional. We shrank the 671B-parameter model from 720GB to just 131GB - an 80% size reduction. Naively quantizing all layers breaks the model entirely, causing endless loops & gibberish outputs. Our dynamic quants solve this. The 1.58-bit quant fits in 160GB VRAM (2x H100 80GB) for fast inference at ~140 tokens/sec throughput. By studying DeepSeek AI's R1 architecture, we selectively quantized certain layers to higher bits (like 4-bit) and left most MoE layers at 1.58-bit. Benchmarks + Blog: https://lnkd.in/g5uA3855 Dynamic GGUFs (131GB–212GB) on Hugging Face: https://lnkd.in/gP7ysgfe
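Once the shards are downloaded, a rough sketch of loading them with the llama-cpp-python bindings (the file path and layer-offload count are placeholders; the blog covers the exact llama.cpp flags):

from llama_cpp import Llama

# Illustrative: point the bindings at the first 1.58-bit shard (placeholder path);
# llama.cpp picks up the remaining shards in the same folder automatically.
llm = Llama(
    model_path = "DeepSeek-R1-GGUF/DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf",
    n_gpu_layers = 40,   # tune to your available VRAM
    n_ctx = 4096,
)
out = llm("Why is the sky blue? Think step by step.", max_tokens=256)
print(out["choices"][0]["text"])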
Unsloth AI reposted this
Running Phi 4 w/ Ollama & Unsloth AI on Mac, 100% local and fully private! 🔥 ollama run hf.co/unsloth/phi-4-GGUF:Q8_0 That's it! 🤗