From the course: Responsible Generative AI and Local LLMs
Deploying LLMs with Lorax and SkyPilot
- [Instructor] One of the big problems you face when working with large language models is how to deploy them. One solution is LoRAX. If we take a look at the documentation from Predibase, it's actually surprisingly simple: you run a container with a base large language model, in this case Mistral, choose your GPUs, and you're ready to go. You can also see some examples of how to prompt it, so this is pretty compelling. What's also nice is that LoRAX is built in Rust and has great inference performance. Now, I think one of the easier ways to deploy this is to use a cloud-based GPU. If we go to SkyPilot and click on it, all we have to do is a pip install and then make sure it's ready to go. From there it's surprisingly simple, because all you need to do is paste in a YAML file. This…
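The SkyPilot flow described above, a pip install followed by a YAML task file, can be sketched roughly like this. This is a minimal illustration, not the file from the video: the accelerator name, image tag, and model ID are assumptions based on public LoRAX and SkyPilot documentation, so check the current docs for exact values.

```yaml
# lorax.yaml — a minimal SkyPilot task sketch (illustrative only;
# GPU type, image tag, and model ID are assumptions, not verbatim
# from the course)
resources:
  accelerators: A10G:1        # request one cloud GPU

run: |
  docker run --gpus all --shm-size 1g -p 8080:80 \
    ghcr.io/predibase/lorax:main \
    --model-id mistralai/Mistral-7B-Instruct-v0.1
```

With SkyPilot installed (for example, `pip install "skypilot[aws]"`), a task file like this would typically be launched with `sky launch lorax.yaml`, which provisions the GPU instance and starts the container for you.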
Contents
- Coding ELO in Python (4m 7s)
- Coding ELO in Rust (3m 49s)
- Coding ELO in R (3m 31s)
- Coding ELO in Julia (3m 5s)
- Profit sharing concepts (5m 40s)
- Tragedy of the commons (4m)
- Deploying LLMs with Lorax and SkyPilot (3m 56s)
- Fine-tune Mistral and Ludwig (3m 22s)
- Game theory in generative AI (4m 45s)
- Perfect competition (2m 45s)
- Negative externalities (3m 23s)
- Regulatory entrepreneurship (4m 18s)
- Creating reinforcement bias (3m 59s)
- Getting started with Mozilla llamafile (3m 36s)
- Developing cosmopolitan (4m 29s)
- Building blocks for generative AI with whisper.cpp (2m 53s)
- Transcribing with Whisper (2m 56s)
- Portable phrase CLI (3m 34s)
- Candle hello world (2m 56s)
- Exploring StarCoder in Rust (5m 54s)
- Whisper Candle transcriber (5m 51s)
- Local system metrics (3m 4s)
- Exploring remote development on AWS (2m 15s)
- Rust for large language models (LLMs) (2m)
- The continuous build binary (2m 6s)
- Serverless inference (1m 56s)
- Rust CLI inference (2m 7s)
- Rust chat inference (2m 3s)
- The chat loop with StarCoder (2m 4s)
- Invoke an LLM on an AWS G5 instance, part 1 (4m 36s)
- Invoke an LLM on an AWS G5 instance, part 2 (2m 58s)