Initializing each model in a `DiffusionPipeline` separately and then quantizing it just to run full inference is a painful dev experience 🥲

Today, we're shipping something you'll love: pipeline-level quantization. Pass a quantization config directly to `DiffusionPipeline.from_pretrained()` 🔥

That is the easiest entry point. But maybe you want more flexibility, for example a different quantization config for each component. In that case you specify configs, not entire models. You retain all the flexibility, with less effort 🤗

📜 Docs: https://lnkd.in/gMHKePEz
👨‍💻 PR: https://lnkd.in/gixRmKc6
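Here is a rough sketch of what both styles could look like in code, assuming a recent `diffusers` release that exposes `PipelineQuantizationConfig`, plus `transformers` and `bitsandbytes` installed. The bitsandbytes backend, the component names `transformer` and `text_encoder_2`, and the helper function names are illustrative assumptions, not taken from the post; check the docs for the options your pipeline supports.

```python
# Sketch of pipeline-level quantization with diffusers.
# Assumptions (not from the post): the bitsandbytes backend and the component
# names "transformer" / "text_encoder_2", which fit Flux-style pipelines.

# Plain description of what to quantize; no model is loaded at this point.
QUANT_SPEC = {
    "quant_backend": "bitsandbytes_4bit",
    "quant_kwargs": {"load_in_4bit": True, "bnb_4bit_quant_type": "nf4"},
    "components_to_quantize": ["transformer", "text_encoder_2"],
}


def load_quantized_pipeline(model_id: str, spec: dict = QUANT_SPEC):
    """Easiest entry point: one backend applied to the listed components."""
    # Heavy imports stay local so QUANT_SPEC can be inspected without a GPU.
    import torch
    from diffusers import DiffusionPipeline
    from diffusers.quantizers import PipelineQuantizationConfig

    quant_config = PipelineQuantizationConfig(
        quant_backend=spec["quant_backend"],
        quant_kwargs=dict(spec["quant_kwargs"], bnb_4bit_compute_dtype=torch.bfloat16),
        components_to_quantize=spec["components_to_quantize"],
    )
    return DiffusionPipeline.from_pretrained(
        model_id,
        quantization_config=quant_config,
        torch_dtype=torch.bfloat16,
    )


def load_mixed_precision_pipeline(model_id: str):
    """More flexible: a separate quantization config per component."""
    import torch
    from diffusers import BitsAndBytesConfig as DiffusersBnbConfig
    from diffusers import DiffusionPipeline
    from diffusers.quantizers import PipelineQuantizationConfig
    from transformers import BitsAndBytesConfig as TransformersBnbConfig

    quant_config = PipelineQuantizationConfig(
        quant_mapping={
            # diffusers models take a diffusers config ...
            "transformer": DiffusersBnbConfig(
                load_in_4bit=True,
                bnb_4bit_quant_type="nf4",
                bnb_4bit_compute_dtype=torch.bfloat16,
            ),
            # ... while transformers models take a transformers config.
            "text_encoder_2": TransformersBnbConfig(load_in_8bit=True),
        }
    )
    return DiffusionPipeline.from_pretrained(
        model_id,
        quantization_config=quant_config,
        torch_dtype=torch.bfloat16,
    )
```

Calling `load_quantized_pipeline("black-forest-labs/FLUX.1-dev")` would be one way to try it; that checkpoint ID is only an example and any pipeline with matching component names should behave similarly. In both helpers you only describe the quantization, and `from_pretrained()` handles loading and quantizing each component.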