# WSL GPU Setup Guide
Guide for configuring NVIDIA GPU support in WSL2 for the Local LLM Agent.
## Prerequisites
| Requirement | Minimum Version |
|---|---|
| Windows | Windows 11 (or Windows 10 21H2+) |
| WSL | WSL2 |
| NVIDIA Driver | 525.xx or newer |
| GPU | NVIDIA with CUDA support |
## Quick Setup

Run the automated setup script:

```bash
# From WSL Ubuntu-24.04
cd /mnt/c/Empresas/ISEM/workspace-v2/projects/local-llm-agent
chmod +x scripts/setup-wsl-gpu.sh
./scripts/setup-wsl-gpu.sh
```
## Manual Setup

### Step 1: Verify Windows NVIDIA Driver

On Windows, open PowerShell and run:

```powershell
nvidia-smi
```

The output should report a driver version of 525.xx or newer. If not, update from: https://www.nvidia.com/drivers
### Step 2: Update WSL

```powershell
# From Windows PowerShell (Admin)
wsl --update
wsl --shutdown
wsl -d Ubuntu-24.04
```
### Step 3: Verify GPU in WSL

```bash
# From WSL
nvidia-smi
```

You should see your GPU listed. If not, ensure:

- the Windows NVIDIA driver is installed
- WSL is updated
- WSL was restarted after driver installation
### Step 4: Install CUDA Toolkit

```bash
# Add the NVIDIA CUDA repository
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
rm cuda-keyring_1.1-1_all.deb

# Install CUDA Toolkit 12.6
sudo apt-get update
sudo apt-get install -y cuda-toolkit-12-6

# Add CUDA to PATH
echo 'export PATH=/usr/local/cuda-12.6/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.6/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc

# Verify
nvcc --version
```
### Step 5: Install Docker

```bash
# Prerequisites
sudo apt-get update
sudo apt-get install -y ca-certificates curl gnupg

# Add the Docker GPG key
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

# Add the repository
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install Docker
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

# Add your user to the docker group
sudo usermod -aG docker $USER

# Log out and log back in, or:
newgrp docker
```
### Step 6: Install NVIDIA Container Toolkit

```bash
# Add the repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

# Configure Docker to use the NVIDIA runtime
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```
### Step 7: Verify GPU in Docker

```bash
docker run --rm --gpus all nvidia/cuda:12.6.0-base-ubuntu22.04 nvidia-smi
```

Expected output:

```text
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.xx.xx              Driver Version: 560.xx.xx      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX XXXX        On  |   00000000:01:00.0  On |                  N/A |
| 30%   45C    P8             15W / 200W  |    1234MiB /   8192MiB |      0%      Default |
+-----------------------------------------+------------------------+----------------------+
```
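Once all seven steps are done, a quick way to confirm that every required tool actually landed on PATH is a small preflight check. This is a sketch; the `preflight` function name and the tool list in the commented invocation are illustrative, not part of the setup script.

```shell
# Preflight: confirm each tool from the steps above is on PATH.
# Prints one line per tool; returns nonzero if anything is missing.
preflight() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok: $tool"
        else
            echo "MISSING: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Typical invocation after finishing the manual setup:
# preflight nvidia-smi nvcc docker nvidia-ctk
```

If anything prints `MISSING`, revisit the corresponding step before moving on to the Docker GPU test.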
## Troubleshooting

### GPU not visible in WSL

1. Update the Windows NVIDIA driver:
   - Download the latest from https://www.nvidia.com/drivers
   - Restart Windows

2. Update WSL:

   ```powershell
   wsl --update
   wsl --shutdown
   ```

3. Check the WSL version:

   ```powershell
   wsl -l -v
   ```

   Ensure Ubuntu-24.04 shows VERSION 2.
### Docker can't access the GPU

1. Restart Docker:

   ```bash
   sudo systemctl restart docker
   ```

2. Reconfigure the NVIDIA runtime:

   ```bash
   sudo nvidia-ctk runtime configure --runtime=docker
   sudo systemctl restart docker
   ```

3. Check the Docker daemon config:

   ```bash
   cat /etc/docker/daemon.json
   ```

   It should contain:

   ```json
   {
     "runtimes": {
       "nvidia": {
         "path": "nvidia-container-runtime",
         "runtimeArgs": []
       }
     }
   }
   ```
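Rather than eyeballing the JSON, the check can be scripted. A sketch, assuming `python3` is available in the WSL distro; the `check_nvidia_runtime` name is illustrative:

```shell
# Check that the Docker daemon config registers the "nvidia" runtime.
# Defaults to /etc/docker/daemon.json; pass another path to test a copy.
check_nvidia_runtime() {
    conf="${1:-/etc/docker/daemon.json}"
    python3 - "$conf" <<'PY'
import json, sys

# Parse the daemon config and look for a "nvidia" entry under "runtimes".
with open(sys.argv[1]) as f:
    cfg = json.load(f)
ok = "nvidia" in cfg.get("runtimes", {})
print("nvidia runtime configured" if ok else "nvidia runtime MISSING")
sys.exit(0 if ok else 1)
PY
}

# Usage:
# check_nvidia_runtime            # inspects /etc/docker/daemon.json
```

A nonzero exit status means `nvidia-ctk runtime configure` needs to be re-run.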
### Out of Memory (OOM) errors

1. Check GPU memory:

   ```bash
   nvidia-smi
   ```

2. Free up GPU memory:
   - Close other GPU applications
   - Reduce the model size or batch size

3. Configure the WSL memory limit. Create or edit `%UserProfile%\.wslconfig`:

   ```ini
   [wsl2]
   memory=16GB
   processors=8
   gpuSupport=true
   ```
### CUDA version mismatch

Ensure the CUDA toolkit version matches what the driver supports:
| Driver Version | Max CUDA Version |
|---|---|
| >= 560.x | CUDA 12.6 |
| >= 545.x | CUDA 12.3 |
| >= 525.x | CUDA 12.0 |
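The compatibility table can be encoded as a small helper for setup scripts. A sketch; `max_cuda_for_driver` is an illustrative name, and the version thresholds come straight from the table above.

```shell
# Map an NVIDIA driver version to the newest CUDA toolkit it supports,
# per the compatibility table above.
max_cuda_for_driver() {
    major="${1%%.*}"              # keep only the major component, e.g. 560 from 560.35.03
    if   [ "$major" -ge 560 ]; then echo "12.6"
    elif [ "$major" -ge 545 ]; then echo "12.3"
    elif [ "$major" -ge 525 ]; then echo "12.0"
    else echo "unsupported"       # older than the 525.xx minimum from Prerequisites
    fi
}

# The installed version can be read from nvidia-smi, e.g.:
# driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)
max_cuda_for_driver "560.35.03"   # prints 12.6
```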
## Hardware Requirements

### Minimum (Development)
- GPU: NVIDIA GTX 1060 6GB
- VRAM: 6GB
- Models: TinyLlama, Phi-2
### Recommended (Production)
- GPU: NVIDIA RTX 3090 / RTX 4090 / A100
- VRAM: 24GB+
- Models: Llama-2-7B, Mistral-7B, CodeLlama-7B
### Model VRAM Requirements
| Model | Parameters | Approx VRAM (FP16) |
|---|---|---|
| TinyLlama | 1.1B | ~2GB |
| Phi-2 | 2.7B | ~6GB |
| Llama-2-7B | 7B | ~14GB |
| Mistral-7B | 7B | ~14GB |
| CodeLlama-13B | 13B | ~26GB |
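The FP16 figures above follow a simple rule of thumb: roughly 2 bytes per parameter, counting only the weights (the KV cache and activations add more on top). A sketch of that arithmetic, with an illustrative `vram_gb_fp16` helper:

```shell
# Rough FP16 VRAM estimate for model weights: ~2 bytes per parameter.
# Takes the parameter count in billions; ballpark only, matching the table above.
vram_gb_fp16() {
    params_billions="$1"
    # 2 bytes/param => billions_of_params * 2 gives GB (treating 1 GB ~ 1e9 bytes)
    awk -v p="$params_billions" 'BEGIN { printf "%.1f\n", p * 2 }'
}

vram_gb_fp16 7     # prints 14.0 (Llama-2-7B / Mistral-7B row above)
vram_gb_fp16 1.1   # prints 2.2 (TinyLlama row above)
```

Quantized formats (e.g. 4-bit) cut the per-parameter cost well below 2 bytes, which is how 7B models fit on smaller GPUs.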
## Next Steps

After completing GPU setup:

1. Start the vLLM stack:

   ```bash
   docker-compose -f docker-compose.vllm.yml up -d
   ```

2. Verify vLLM health:

   ```bash
   curl http://localhost:8000/health
   ```

3. Test inference:

   ```bash
   curl http://localhost:3160/v1/chat/completions \
     -H "Content-Type: application/json" \
     -d '{"model":"mistral","messages":[{"role":"user","content":"Hello"}]}'
   ```
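Since the vLLM container can take a while to load model weights, scripts should wait for the health endpoint before sending inference requests. A sketch, assuming `curl` is installed; `wait_for_health` is an illustrative name, and the URL in the commented example uses the health port from the steps above:

```shell
# Poll an HTTP endpoint until it responds, with a retry limit.
# Returns 0 once the endpoint answers, 1 after the retries run out.
wait_for_health() {
    url="$1"
    tries="${2:-30}"              # default: 30 attempts, one second apart
    i=0
    while [ "$i" -lt "$tries" ]; do
        if curl -fsS "$url" >/dev/null 2>&1; then
            echo "healthy: $url"
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    echo "timed out waiting for $url"
    return 1
}

# Typical usage before step 3 above:
# wait_for_health http://localhost:8000/health && echo "ready for inference"
```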