# LoRA Adapters

This directory contains LoRA (Low-Rank Adaptation) adapters for project-specific fine-tuning.

## Directory Structure

```
lora-adapters/
├── README.md                 # This file
├── erp-core/                 # ERP Core domain adapter
│   ├── adapter_config.json
│   └── adapter_model.safetensors
├── trading/                  # Trading platform adapter
│   ├── adapter_config.json
│   └── adapter_model.safetensors
└── {project-name}/           # Additional project adapters
    ├── adapter_config.json
    └── adapter_model.safetensors
```

## Creating LoRA Adapters

### Prerequisites

- Base model: `mistralai/Mistral-7B-Instruct-v0.2` (or compatible)
- Training data in JSONL format
- PEFT library for training

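The training data is plain JSONL: one JSON object per line. A minimal sketch of producing such a file (the `instruction`/`response` field names are illustrative assumptions — use whatever schema your training script expects):

```python
import json

# Hypothetical examples; the field names are illustrative, not a fixed schema.
examples = [
    {"instruction": "How do I create an invoice in the ERP system?",
     "response": "Open Billing > Invoices and choose 'New Invoice'."},
    {"instruction": "How do I void a posted invoice?",
     "response": "Use Billing > Invoices > Actions > Void."},
]

with open("train.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        f.write(json.dumps(ex, ensure_ascii=False) + "\n")

# Sanity check: every line must parse back as a standalone JSON object.
with open("train.jsonl", encoding="utf-8") as f:
    rows = [json.loads(line) for line in f]
```
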
### Training Example

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load base model
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

# Configure LoRA
lora_config = LoraConfig(
    r=64,                # Rank
    lora_alpha=128,      # Alpha scaling
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

# Apply LoRA
model = get_peft_model(model, lora_config)

# Train...

# Save adapter
model.save_pretrained("lora-adapters/your-adapter")
```
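
For intuition about `r` and `lora_alpha`: LoRA leaves the frozen weight `W` untouched and adds a low-rank update scaled by `alpha / r`, i.e. `W_eff = W + (alpha / r) * B @ A`. A toy sketch with plain Python lists (illustrative 2x2 shapes, not real model weights):

```python
# Toy LoRA update: W_eff = W + (alpha / r) * (B @ A)
r, alpha = 1, 2          # rank-1 adapter, so the scaling factor alpha/r = 2.0
W = [[1.0, 0.0],
     [0.0, 1.0]]         # frozen base weight (out x in)
A = [[0.5, 0.5]]         # low-rank factor A (r x in), trained
B = [[1.0],
     [0.0]]              # low-rank factor B (out x r), trained

scale = alpha / r
delta = [[scale * sum(B[i][k] * A[k][j] for k in range(r))
          for j in range(2)] for i in range(2)]
W_eff = [[W[i][j] + delta[i][j] for j in range(2)] for i in range(2)]
```

Only `A` and `B` are trained and stored in the adapter, which is why the `.safetensors` file stays small relative to the base model.
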

### Required Files

1. **adapter_config.json**: LoRA configuration

   ```json
   {
     "base_model_name_or_path": "mistralai/Mistral-7B-Instruct-v0.2",
     "peft_type": "LORA",
     "task_type": "CAUSAL_LM",
     "r": 64,
     "lora_alpha": 128,
     "lora_dropout": 0.05,
     "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"]
   }
   ```

2. **adapter_model.safetensors**: LoRA weights

## Using LoRA Adapters with vLLM

### Configuration

Adapters are automatically mounted in the vLLM container:

```yaml
# docker-compose.vllm.yml
volumes:
  - ./lora-adapters:/lora-adapters:ro
```
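
Mounting the directory is not enough on its own: the vLLM server must also be started with LoRA support enabled. A sketch of the relevant service flags, assuming the standard vLLM OpenAI-compatible server (the service name and adapter registrations here are illustrative — adjust to the actual compose file):

```yaml
# docker-compose.vllm.yml (sketch; names and paths are assumptions)
services:
  vllm:
    command: >
      --model mistralai/Mistral-7B-Instruct-v0.2
      --enable-lora
      --lora-modules erp-core=/lora-adapters/erp-core trading=/lora-adapters/trading
      --max-loras 2
      --max-lora-rank 64
    volumes:
      - ./lora-adapters:/lora-adapters:ro
```

`--max-lora-rank` must be at least the `r` used at training time, and `--max-loras` bounds how many adapters can be active concurrently.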

### API Usage

```bash
# Chat with LoRA adapter
curl -X POST http://localhost:3160/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistralai/Mistral-7B-Instruct-v0.2",
    "messages": [
      {"role": "user", "content": "How do I create an invoice in the ERP system?"}
    ],
    "lora_adapter": "erp-core"
  }'
```
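
The same request can be issued from Python with only the standard library. This mirrors the curl call above; the `lora_adapter` field follows this setup's API, and the helper name is illustrative:

```python
import json

VLLM_CHAT_URL = "http://localhost:3160/v1/chat/completions"

def build_chat_payload(adapter: str, user_message: str) -> dict:
    """Assemble the JSON body for a chat request routed through a LoRA adapter."""
    return {
        "model": "mistralai/Mistral-7B-Instruct-v0.2",
        "messages": [{"role": "user", "content": user_message}],
        "lora_adapter": adapter,
    }

payload = build_chat_payload("erp-core", "How do I create an invoice in the ERP system?")
body = json.dumps(payload).encode()

# With the container running, send it via urllib:
#   import urllib.request
#   req = urllib.request.Request(VLLM_CHAT_URL, data=body,
#                                headers={"Content-Type": "application/json"})
#   print(urllib.request.urlopen(req).read().decode())
```
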

### Listing Available Adapters

```bash
# List LoRA adapters
curl http://localhost:3160/v1/lora/adapters
```

## Project-Specific Adapters

### erp-core

- **Purpose**: ERP domain knowledge (invoices, inventory, accounting)
- **Training data**: ERP documentation, code, user interactions
- **Base model**: Mistral-7B-Instruct

### trading

- **Purpose**: Trading platform domain (orders, positions, market data)
- **Training data**: Trading documentation, API specs, user queries
- **Base model**: Mistral-7B-Instruct

## Best Practices

1. **Keep adapters small**: LoRA adapters should be < 100 MB
2. **Test locally first**: Verify the adapter loads correctly before deploying
3. **Version control**: Track adapter versions separately from the base model
4. **Documentation**: Document the training data and hyperparameters for each adapter
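
The 100 MB guideline can be sanity-checked from the LoRA config alone: each adapted weight of shape `(out, in)` adds `r * (in + out)` parameters. A rough estimate for the r=64 configuration used here — the Mistral-7B shapes (32 layers, hidden size 4096, 1024-dim k/v projections under grouped-query attention) are assumptions from the public model card:

```python
# Assumed (in_features, out_features) per adapted module in Mistral-7B
shapes = {
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),   # grouped-query attention: 8 KV heads * head_dim 128
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
}
r = 64
num_layers = 32

# LoRA adds an (r x in) matrix A and an (out x r) matrix B per module
params_per_layer = sum(r * (fin + fout) for fin, fout in shapes.values())
total_params = params_per_layer * num_layers
size_mib = total_params * 2 / 2**20   # fp16/bf16 weights, 2 bytes each
```

Under these assumptions, r=64 lands around 104 MiB, right at the guideline; halving the rank to r=32 halves the adapter size.
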

## Troubleshooting

### Adapter not loading

1. Check file permissions
2. Verify that `adapter_config.json` matches the base model
3. Check the vLLM logs: `docker logs local-llm-vllm`
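
Step 2 can be scripted. A small sketch (the helper name and expected model string are assumptions — match them to your deployment) that checks the adapter was trained against the base model the server is actually running:

```python
import json
import tempfile
from pathlib import Path

EXPECTED_BASE = "mistralai/Mistral-7B-Instruct-v0.2"

def base_model_matches(adapter_dir: str, expected: str = EXPECTED_BASE) -> bool:
    """Return True if the adapter's recorded base model matches the served one."""
    config = json.loads(Path(adapter_dir, "adapter_config.json").read_text())
    return config.get("base_model_name_or_path") == expected

# Demonstrate against a throwaway adapter directory
demo_dir = tempfile.mkdtemp()
Path(demo_dir, "adapter_config.json").write_text(
    json.dumps({"base_model_name_or_path": EXPECTED_BASE, "peft_type": "LORA"})
)
ok = base_model_matches(demo_dir)
```
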

### Memory issues

1. Reduce `max_loras` in the docker-compose configuration
2. Use a smaller LoRA rank (e.g. r=32 instead of r=64)
3. Enable LoRA merging for inference

## References

- [PEFT Library](https://github.com/huggingface/peft)
- [vLLM LoRA Support](https://docs.vllm.ai/en/latest/models/lora.html)
- [LoRA Paper](https://arxiv.org/abs/2106.09685)