# LLM Agent Service

AI-powered trading copilot with local GPU support for the OrbiQuant IA Trading Platform.

## Overview

This service provides an intelligent trading agent that runs locally on GPU using **Ollama**, with support for:

- **Local LLM Inference:** Run Llama 3, Mistral, and other models on your GPU
- **Trading Analysis:** Technical analysis, market sentiment, AMD phase identification
- **Function Calling (Tools):** Get real-time data, ML signals, and portfolio info
- **Educational Support:** Explain concepts and recommend learning paths
- **Risk Management:** Position sizing and stop-loss recommendations
- **Streaming Responses:** Real-time chat with Server-Sent Events (SSE)
## Why Local LLM?

- **Privacy:** Your trading conversations stay on your hardware
- **Cost:** No API costs after the initial setup
- **Speed:** Low latency with a local GPU
- **Customization:** Full control over model and prompts
- **Always Available:** No internet dependency for inference
|
## Technology Stack

- **Python:** 3.11+
- **LLM Provider:** Ollama (local GPU) with Claude/OpenAI fallback
- **Models:** Llama 3 8B (recommended), Mistral 7B, Mixtral 8x7B
- **API Framework:** FastAPI with SSE streaming
- **Tools System:** Custom function calling for trading operations
- **Database:** PostgreSQL (asyncpg)
- **Cache:** Redis
- **Testing:** pytest, pytest-asyncio
|
## Setup

### Prerequisites

- **GPU:** NVIDIA RTX 5060 Ti (16GB VRAM) or better
- **RAM:** 16GB minimum, 32GB recommended
- **OS:** Linux (Ubuntu 20.04+) or WSL2
- **Software:**
  - Docker with the NVIDIA Container Toolkit
  - Python 3.11+
  - Miniconda or Anaconda
|
### Quick Start

1. **Start Ollama:**

```bash
cd /home/isem/workspace/projects/trading-platform/apps/llm-agent

# Start Ollama with GPU support
docker-compose -f docker-compose.ollama.yml up -d

# Pull the Llama 3 8B model (recommended for 16GB VRAM)
docker exec orbiquant-ollama ollama pull llama3:8b
```

2. **Configure the Service:**

```bash
# Copy the environment file
cp .env.example .env

# Edit the configuration (LLM_PROVIDER=ollama is the default)
nano .env
```

3. **Install Dependencies:**

```bash
# Create the conda environment
conda env create -f environment.yml
conda activate orbiquant-llm-agent

# Or install with pip
pip install -r requirements.txt
```

4. **Start the Service:**

```bash
# Development mode with hot reload
uvicorn src.main:app --reload --host 0.0.0.0 --port 8003
```

5. **Test it:**

Open http://localhost:8003/docs in your browser.

```bash
# Or via curl
curl http://localhost:8003/api/v1/health
```
|
### Detailed Setup

See [DEPLOYMENT.md](./DEPLOYMENT.md) for:

- Complete installation guide
- GPU configuration
- Model selection
- Performance tuning
- Troubleshooting
|
## Project Structure

```
llm-agent/
├── src/
│   ├── main.py                  # FastAPI application
│   ├── config.py                # Configuration management
│   ├── core/                    # Core LLM functionality
│   │   ├── llm_client.py        # Ollama/Claude client
│   │   ├── prompt_manager.py    # System prompts
│   │   └── context_manager.py   # Conversation context
│   ├── tools/                   # Trading tools (function calling)
│   │   ├── base.py              # Tool base classes
│   │   ├── signals.py           # ML signals & analysis
│   │   ├── portfolio.py         # Portfolio management
│   │   ├── trading.py           # Trading execution
│   │   └── education.py         # Educational tools
│   ├── prompts/                 # System prompts
│   │   ├── system.txt           # Main trading copilot prompt
│   │   ├── analysis.txt         # Analysis template
│   │   └── strategy.txt         # Strategy template
│   ├── api/                     # API routes
│   │   └── routes.py            # All endpoints
│   ├── models/                  # Pydantic models
│   ├── services/                # Business logic
│   └── repositories/            # Data access layer
├── tests/
├── docker-compose.ollama.yml    # Ollama GPU setup
├── DEPLOYMENT.md                # Detailed deployment guide
├── requirements.txt
├── environment.yml
└── .env.example
```
|
## API Endpoints

### Core Endpoints

- `GET /` - Service info and health
- `GET /api/v1/health` - Detailed health check with LLM status
- `GET /api/v1/models` - List available LLM models

### Chat & Analysis

- `POST /api/v1/chat` - Interactive chat (supports streaming)
- `POST /api/v1/analyze` - Comprehensive symbol analysis
- `POST /api/v1/strategy` - Generate a trading strategy
- `POST /api/v1/explain` - Explain trading concepts

### Tools & Context

- `GET /api/v1/tools` - List the tools available for the user's plan
- `DELETE /api/v1/context/{user_id}/{conversation_id}` - Clear a conversation

See the interactive docs at http://localhost:8003/docs.
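The streaming chat endpoint delivers its response as Server-Sent Events. A minimal sketch of parsing an SSE stream on the client side — the `data:` framing is standard SSE, but the JSON payload fields (`delta`) and the `[DONE]` end marker are illustrative assumptions, not this service's documented schema:

```python
import json


def parse_sse_events(raw: str) -> list[dict]:
    """Parse a raw SSE stream into a list of decoded JSON payloads.

    Events are separated by blank lines; each data line starts with "data: ".
    """
    events = []
    for block in raw.split("\n\n"):
        for line in block.splitlines():
            if line.startswith("data: "):
                payload = line[len("data: "):]
                if payload.strip() == "[DONE]":  # assumed end-of-stream marker
                    continue
                events.append(json.loads(payload))
    return events


# Example with a canned stream (payload shape is illustrative only)
raw = 'data: {"delta": "BTC looks "}\n\ndata: {"delta": "range-bound."}\n\ndata: [DONE]\n\n'
chunks = parse_sse_events(raw)
print("".join(e["delta"] for e in chunks))  # BTC looks range-bound.
```

A real client would feed bytes from the HTTP response into this incrementally; the splitting logic stays the same.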
|
## Development

### Code Quality

```bash
# Format code
black src/
isort src/

# Lint
flake8 src/

# Type checking
mypy src/
```

### Testing

```bash
# Run all tests
pytest

# With coverage
pytest --cov=src --cov-report=html

# Specific tests
pytest tests/unit/
```
|
## Trading Tools (Function Calling)

The agent has access to these tools:

### Market Data (Free)

- `get_analysis` - Current price, volume, 24h change
- `get_news` - Recent news with sentiment analysis
- `calculate_position_size` - Risk-based position sizing

### ML & Signals (Pro/Premium)

- `get_signal` - ML predictions with entry/exit levels
- `check_portfolio` - Portfolio overview and P&L
- `get_positions` - Detailed position information
- `get_trade_history` - Historical trades with metrics

### Trading (Pro/Premium)

- `execute_trade` - Execute paper-trading orders
- `set_alert` - Create price alerts

### Education (Free)

- `explain_concept` - Explain trading terms (RSI, MACD, AMD, etc.)
- `get_course_info` - Recommend learning resources

Tools are automatically filtered based on the user's subscription plan (Free/Pro/Premium).
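Risk-based position sizing, as provided by a tool like `calculate_position_size`, typically follows the standard fixed-fractional rule: risk a fixed percentage of the account per trade, divided by the per-unit distance to the stop loss. A minimal sketch of that rule — the formula is the common industry convention, not necessarily this service's exact implementation:

```python
def calculate_position_size(
    account_balance: float,
    risk_pct: float,        # fraction of the account to risk, e.g. 0.01 for 1%
    entry_price: float,
    stop_loss_price: float,
) -> float:
    """Units to buy so that a stopped-out trade loses exactly risk_pct of the account."""
    risk_amount = account_balance * risk_pct
    per_unit_risk = abs(entry_price - stop_loss_price)
    if per_unit_risk == 0:
        raise ValueError("entry and stop-loss prices must differ")
    return risk_amount / per_unit_risk


# Risk 1% of a $10,000 account; entry 100, stop 95 -> $100 risk over $5/unit = 20 units
size = calculate_position_size(10_000, 0.01, 100.0, 95.0)
print(size)  # 20.0
```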
|
## System Prompt & Trading Philosophy

The agent is driven by a comprehensive system prompt that includes:

- **AMD Framework:** Identifies Accumulation, Manipulation, and Distribution phases
- **Risk Management:** Always prioritizes proper position sizing and stop losses
- **Educational Approach:** Explains the "why" behind every recommendation
- **Multi-timeframe Analysis:** Considers multiple timeframes for context
- **Data-Driven:** Uses tools to fetch real data, never invents prices

The agent will:

- ✅ Provide educational analysis with clear risk management
- ✅ Explain concepts in simple terms for all levels
- ✅ Use real market data via tools
- ✅ Warn against risky behavior
- ❌ NEVER give financial advice or guarantee returns
- ❌ NEVER invent market data or prices
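A prompt manager like the one in `src/core/prompt_manager.py` typically keeps templates on disk and fills placeholders at request time. A minimal sketch of that pattern — the template text and placeholder names below are illustrative, not the actual contents of `prompts/system.txt`:

```python
from string import Template

# Illustrative template; the real one lives in src/prompts/system.txt
SYSTEM_TEMPLATE = Template(
    "You are a trading copilot for $platform.\n"
    "Rules: never give financial advice; never invent prices; "
    "fetch market data only through the provided tools.\n"
    "User plan: $plan"
)


def build_system_prompt(platform: str, plan: str) -> str:
    """Fill the system template with per-request values."""
    return SYSTEM_TEMPLATE.substitute(platform=platform, plan=plan)


prompt = build_system_prompt("OrbiQuant IA", "pro")
print(prompt.splitlines()[-1])  # User plan: pro
```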
|
## Architecture

Built following SOLID principles:

- **LLM Client Abstraction:** Unified interface for Ollama/Claude/OpenAI
- **Tool Registry:** Dynamic tool loading with permission checking
- **Context Manager:** Maintains conversation history efficiently
- **Prompt Manager:** Centralized prompt templates
- **Streaming Support:** Real-time responses with SSE
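The tool registry's permission checking can be sketched as a mapping from subscription plan to allowed tools. The tool names below match the list in the Trading Tools section; the class shapes are assumptions, not the actual `src/tools/base.py` API:

```python
from dataclasses import dataclass, field

# Plans ordered by privilege; a higher plan includes everything below it
PLAN_RANK = {"free": 0, "pro": 1, "premium": 2}


@dataclass
class Tool:
    name: str
    min_plan: str = "free"  # lowest plan allowed to call this tool


@dataclass
class ToolRegistry:
    tools: list[Tool] = field(default_factory=list)

    def register(self, tool: Tool) -> None:
        self.tools.append(tool)

    def for_plan(self, plan: str) -> list[str]:
        """Return the names of tools the given plan may call."""
        rank = PLAN_RANK[plan]
        return [t.name for t in self.tools if PLAN_RANK[t.min_plan] <= rank]


registry = ToolRegistry()
registry.register(Tool("get_analysis"))                   # Free
registry.register(Tool("calculate_position_size"))        # Free
registry.register(Tool("get_signal", min_plan="pro"))     # Pro/Premium
registry.register(Tool("execute_trade", min_plan="pro"))  # Pro/Premium

print(registry.for_plan("free"))  # ['get_analysis', 'calculate_position_size']
```

At request time, the filtered list is what gets exposed to the LLM as its function-calling schema.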
|
## Configuration

Key environment variables:

```bash
# LLM provider
LLM_PROVIDER=ollama   # ollama, claude, or openai
OLLAMA_BASE_URL=http://localhost:11434
LLM_MODEL=llama3:8b

# Service URLs
BACKEND_URL=http://localhost:8000
DATA_SERVICE_URL=http://localhost:8001
ML_ENGINE_URL=http://localhost:8002

# Optional: Claude fallback
# ANTHROPIC_API_KEY=sk-ant-xxx

# Database & cache
DATABASE_URL=postgresql://...
REDIS_URL=redis://localhost:6379/0
```
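How `src/config.py` consumes these variables is not shown here; a minimal stdlib-only sketch of the usual pattern, with field names and defaults taken from the variables above (the `Settings` class itself is illustrative):

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    llm_provider: str
    ollama_base_url: str
    llm_model: str

    @classmethod
    def from_env(cls) -> "Settings":
        """Read settings from the environment, falling back to documented defaults."""
        return cls(
            llm_provider=os.getenv("LLM_PROVIDER", "ollama"),
            ollama_base_url=os.getenv("OLLAMA_BASE_URL", "http://localhost:11434"),
            llm_model=os.getenv("LLM_MODEL", "llama3:8b"),
        )


settings = Settings.from_env()
print(settings.llm_model)
```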
|
## Model Recommendations

For an **RTX 5060 Ti (16GB VRAM):**

| Model | Size | Speed | Quality | Best For |
|-------|------|-------|---------|----------|
| `llama3:8b` | 4.7GB | ⚡⚡⚡ | ⭐⭐⭐⭐ | **Recommended** - best balance |
| `mistral:7b` | 4.1GB | ⚡⚡⚡⚡ | ⭐⭐⭐ | Fast responses, good quality |
| `llama3:70b` | 40GB+ | ⚡ | ⭐⭐⭐⭐⭐ | Requires 40GB+ VRAM |
| `mixtral:8x7b` | 26GB | ⚡⚡ | ⭐⭐⭐⭐ | Requires 32GB+ VRAM |

**Recommendation:** Start with `llama3:8b` for this hardware.
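The sizes in the table follow a rough rule of thumb: model footprint ≈ parameter count × bits per weight ÷ 8, plus overhead for metadata, KV cache, and activations. A quick sanity-check sketch — the 4.5 bits/weight figure is an approximation for 4-bit quantized models and varies by quantization format:

```python
def quantized_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Rough model footprint in GB: params * bits / 8, ignoring runtime overhead."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9


# llama3:8b at ~4.5 bits/weight -> about 4.5 GB, close to the 4.7GB in the
# table once metadata and runtime overhead are included
print(round(quantized_size_gb(8, 4.5), 1))   # 4.5

# llama3:70b at the same quantization is far beyond a 16GB card
print(round(quantized_size_gb(70, 4.5), 1))  # 39.4
```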
|
## Documentation

- [DEPLOYMENT.md](./DEPLOYMENT.md) - Complete deployment guide
- [API Documentation](http://localhost:8003/docs) - Interactive API docs
- [Specification Docs](/docs/02-definicion-modulos/OQI-007-llm-agent/) - Technical specifications
|
## License

Proprietary - OrbiQuant IA Trading Platform