# WSL GPU Setup Guide

Guide for configuring NVIDIA GPU support in WSL2 for the Local LLM Agent.

## Prerequisites

| Requirement | Minimum Version |
|-------------|-----------------|
| Windows | Windows 11 (or Windows 10 21H2+) |
| WSL | WSL2 |
| NVIDIA Driver | 525.xx or newer |
| GPU | NVIDIA with CUDA support |

## Quick Setup

Run the automated setup script:

```bash
# From WSL Ubuntu-24.04
cd /mnt/c/Empresas/ISEM/workspace-v2/projects/local-llm-agent
chmod +x scripts/setup-wsl-gpu.sh
./scripts/setup-wsl-gpu.sh
```

## Manual Setup

### Step 1: Verify Windows NVIDIA Driver

On Windows, open PowerShell and run:

```powershell
nvidia-smi
```

Expected output shows driver version >= 525.xx. If not, update from https://www.nvidia.com/drivers.

### Step 2: Update WSL

```powershell
# From Windows PowerShell (Admin)
wsl --update
wsl --shutdown
wsl -d Ubuntu-24.04
```

### Step 3: Verify GPU in WSL

```bash
# From WSL
nvidia-smi
```

You should see your GPU listed. If not, ensure:

- Windows NVIDIA driver is installed
- WSL is updated
- WSL was restarted after driver installation

### Step 4: Install CUDA Toolkit

```bash
# Add NVIDIA CUDA repository
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
rm cuda-keyring_1.1-1_all.deb

# Install CUDA Toolkit 12.6
sudo apt-get update
sudo apt-get install -y cuda-toolkit-12-6

# Add to PATH
echo 'export PATH=/usr/local/cuda-12.6/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.6/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc

# Verify
nvcc --version
```

### Step 5: Install Docker

```bash
# Prerequisites
sudo apt-get update
sudo apt-get install -y ca-certificates curl gnupg

# Add Docker GPG key
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

# Add repository
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install Docker
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

# Add user to docker group
sudo usermod -aG docker $USER

# Log out and log back in, or:
newgrp docker
```

### Step 6: Install NVIDIA Container Toolkit

```bash
# Add repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

# Configure Docker
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```

### Step 7: Verify GPU in Docker

```bash
docker run --rm --gpus all nvidia/cuda:12.6.0-base-ubuntu22.04 nvidia-smi
```

Expected output:

```
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.xx.xx              Driver Version: 560.xx.xx      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX XXXX        On  |   00000000:01:00.0  On |                  N/A |
| 30%   45C    P8             15W /  200W |    1234MiB /   8192MiB |      0%      Default |
+-----------------------------------------+------------------------+----------------------+
```

## Troubleshooting

### GPU not visible in WSL

1. **Update Windows NVIDIA driver**
   - Download the latest from https://www.nvidia.com/drivers
   - Restart Windows
2. **Update WSL**
   ```powershell
   wsl --update
   wsl --shutdown
   ```
3. **Check WSL version**
   ```powershell
   wsl -l -v
   ```
   Ensure Ubuntu-24.04 shows VERSION 2.

### Docker can't access GPU

1. **Restart Docker**
   ```bash
   sudo systemctl restart docker
   ```
2. **Reconfigure NVIDIA runtime**
   ```bash
   sudo nvidia-ctk runtime configure --runtime=docker
   sudo systemctl restart docker
   ```
3. **Check Docker daemon config**
   ```bash
   cat /etc/docker/daemon.json
   ```
   Should contain:
   ```json
   {
     "runtimes": {
       "nvidia": {
         "path": "nvidia-container-runtime",
         "runtimeArgs": []
       }
     }
   }
   ```

### Out of Memory (OOM) errors

1. **Check GPU memory**
   ```bash
   nvidia-smi
   ```
2. **Free up GPU memory**
   - Close other GPU applications
   - Reduce model size or batch size
3. **Configure WSL memory limit**

   Create/edit `%UserProfile%\.wslconfig`:
   ```ini
   [wsl2]
   memory=16GB
   processors=8
   gpuSupport=true
   ```

### CUDA version mismatch

Ensure the CUDA toolkit version matches what the driver supports:

| Driver Version | Max CUDA Version |
|----------------|------------------|
| >= 560.x | CUDA 12.6 |
| >= 545.x | CUDA 12.3 |
| >= 525.x | CUDA 12.0 |

## Hardware Requirements

### Minimum (Development)

- GPU: NVIDIA GTX 1060 6GB
- VRAM: 6GB
- Models: TinyLlama, Phi-2

### Recommended (Production)

- GPU: NVIDIA RTX 3090 / RTX 4090 / A100
- VRAM: 24GB+
- Models: Llama-2-7B, Mistral-7B, CodeLlama-7B

### Model VRAM Requirements

| Model | Parameters | Approx VRAM (FP16) |
|-------|------------|--------------------|
| TinyLlama | 1.1B | ~2GB |
| Phi-2 | 2.7B | ~6GB |
| Llama-2-7B | 7B | ~14GB |
| Mistral-7B | 7B | ~14GB |
| CodeLlama-13B | 13B | ~26GB |

## Next Steps

After completing GPU setup:

1. Start the vLLM stack:
   ```bash
   docker-compose -f docker-compose.vllm.yml up -d
   ```
2. Verify vLLM health:
   ```bash
   curl http://localhost:8000/health
   ```
3. Test inference:
   ```bash
   curl http://localhost:3160/v1/chat/completions \
     -H "Content-Type: application/json" \
     -d '{"model":"mistral","messages":[{"role":"user","content":"Hello"}]}'
   ```

## References

- [NVIDIA CUDA on WSL](https://docs.nvidia.com/cuda/wsl-user-guide/index.html)
- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
- [vLLM Documentation](https://docs.vllm.ai/)
- [Docker GPU Support](https://docs.docker.com/config/containers/resource_constraints/#gpu)
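The driver-to-CUDA compatibility table in the troubleshooting section, together with the 525.xx driver minimum from the prerequisites, can be folded into a small scripted check. Below is a minimal bash sketch of that idea; the helper name `max_cuda_for_driver` is hypothetical (it is not part of `scripts/setup-wsl-gpu.sh`), and it assumes the version thresholds shown in the table above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of scripts/setup-wsl-gpu.sh): map an NVIDIA
# driver version to the newest CUDA toolkit it supports, mirroring the
# compatibility table in the troubleshooting section. Drivers older than
# the 525.xx minimum are reported as unsupported.
max_cuda_for_driver() {
  local major="${1%%.*}"   # driver major version, e.g. 560 from "560.35.03"
  if   [ "$major" -ge 560 ]; then echo "12.6"
  elif [ "$major" -ge 545 ]; then echo "12.3"
  elif [ "$major" -ge 525 ]; then echo "12.0"
  else echo "unsupported: driver $1 is older than the 525.xx minimum"
  fi
}
```

To check the running system, the installed driver version can be fed in via nvidia-smi's standard query option, e.g. `max_cuda_for_driver "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)"`.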