Deployment Guide#
You can deploy and run models powered by NIM on your preferred NVIDIA-accelerated infrastructure. This section provides deployment instructions for each supported target environment:
Deploy on your own compute#
Using Docker, as described in Get Started
On Kubernetes, as described in NIM Operator for Kubernetes, Kubernetes Installation, and Deploy with Helm
Across multiple nodes, as described in Multi-Node Deployment
Air-gapped, as described in Air Gap Deployment
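As a sketch of the Kubernetes path above, a Helm-based install typically stores the NGC API key in a secret, adds the NGC Helm repository, and installs a NIM chart. The chart name, release name, and values file below are illustrative; consult the linked Helm and Kubernetes guides for the exact chart and values for your model.

```shell
# Store the NGC API key in a Kubernetes secret for the chart to reference.
# The secret name "ngc-api" is an example, not a required name.
kubectl create secret generic ngc-api --from-literal=NGC_API_KEY="$NGC_API_KEY"

# Add the NGC Helm repository, authenticating with the same key.
helm repo add nim https://helm.ngc.nvidia.com/nim \
  --username='$oauthtoken' --password="$NGC_API_KEY"

# Install a NIM chart; "my-nim" and custom-values.yaml are placeholders
# for your release name and model-specific configuration.
helm install my-nim nim/nim-llm -f custom-values.yaml
```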
Additional deployment guides#
You can also deploy on other platforms:
NVIDIA NIM on WSL2 provides instructions for setting up and configuring a deployment on Windows PCs using Windows Subsystem for Linux (WSL). We recommend setting `NIM_RELAX_MEM_CONSTRAINTS=1` when you deploy with Docker on RTX GPUs to avoid high memory usage.
The NIM on KServe deployment guide provides step-by-step instructions for deploying on KServe.
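A Docker launch on an RTX GPU with the recommended memory setting might look like the following sketch; the image tag, cache path, and port are examples, so substitute the values from the Get Started guide for your model.

```shell
# NIM_RELAX_MEM_CONSTRAINTS=1 is the recommended setting on RTX GPUs
# to avoid high memory usage. The image below is illustrative only.
export NGC_API_KEY="<your-ngc-api-key>"

docker run -it --rm --gpus all \
  -e NGC_API_KEY="$NGC_API_KEY" \
  -e NIM_RELAX_MEM_CONSTRAINTS=1 \
  -v ~/.cache/nim:/opt/nim/.cache \
  -p 8000:8000 \
  nvcr.io/nim/meta/llama3-8b-instruct:latest
```

The volume mount persists downloaded model weights across container restarts, and port 8000 exposes the service's API endpoint on the host.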
The NIM on AWS Elastic Kubernetes Service (EKS) deployment guide provides step-by-step instructions for deploying on AWS EKS.
The NIM on Azure Kubernetes Service (AKS) deployment guide provides step-by-step instructions for deploying on AKS.
The NIM on Azure Machine Learning (AzureML) deployment guide provides step-by-step instructions for deploying on AzureML using the Azure CLI and Jupyter Notebook.
The NIM on AWS SageMaker deployment guide provides step-by-step instructions for deploying on AWS SageMaker using Jupyter Notebooks, Python CLI, and the shell.
The End to End LLM App Development with Azure AI Studio, Prompt Flow, and NIMs deployment guide walks through building an LLM application end to end with Azure AI Studio, Prompt Flow, and NIMs.
Deploy as a managed endpoint service#
For developers and organizations who prefer a fully managed, hosted solution, NIM can be deployed through hosting integration partners, including Baseten, Fireworks AI, and Together AI.