# LLM Agent Service

AI-powered trading copilot with local GPU support for the OrbiQuant IA Trading Platform.

## Overview

This service provides an intelligent trading agent that runs locally on GPU using **Ollama**, with support for:

- **Local LLM Inference:** Run Llama 3, Mistral, and other models on your GPU
- **Trading Analysis:** Technical analysis, market sentiment, AMD phase identification
- **Function Calling (Tools):** Get real-time data, ML signals, and portfolio info
- **Educational Support:** Explain concepts, recommend learning paths
- **Risk Management:** Position sizing, stop-loss recommendations
- **Streaming Responses:** Real-time chat with Server-Sent Events

## Why Local LLM?

- **Privacy:** Your trading conversations stay on your hardware
- **Cost:** No API costs after the initial setup
- **Speed:** Low latency with a local GPU
- **Customization:** Full control over model and prompts
- **Always Available:** No internet dependency for inference

## Technology Stack

- **Python:** 3.11+
- **LLM Provider:** Ollama (local GPU) with Claude/OpenAI fallback
- **Models:** Llama 3 8B (recommended), Mistral 7B, Mixtral 8x7B
- **API Framework:** FastAPI with SSE streaming
- **Tools System:** Custom function calling for trading operations
- **Database:** PostgreSQL (asyncpg)
- **Cache:** Redis
- **Testing:** pytest, pytest-asyncio

## Setup

### Prerequisites

- **GPU:** NVIDIA RTX 5060 Ti (16GB VRAM) or better
- **RAM:** 16GB minimum, 32GB recommended
- **OS:** Linux (Ubuntu 20.04+) or WSL2
- **Software:**
  - Docker with the NVIDIA Container Toolkit
  - Python 3.11+
  - Miniconda or Anaconda

### Quick Start

1. **Start Ollama:**

   ```bash
   cd /home/isem/workspace/projects/trading-platform/apps/llm-agent

   # Start Ollama with GPU support
   docker-compose -f docker-compose.ollama.yml up -d

   # Pull the Llama 3 8B model (recommended for 16GB VRAM)
   docker exec orbiquant-ollama ollama pull llama3:8b
   ```

2.
   **Configure Service:**

   ```bash
   # Copy the environment file
   cp .env.example .env

   # Edit the configuration (LLM_PROVIDER=ollama is the default)
   nano .env
   ```

3. **Install Dependencies:**

   ```bash
   # Create the conda environment
   conda env create -f environment.yml
   conda activate orbiquant-llm-agent

   # Or install with pip
   pip install -r requirements.txt
   ```

4. **Start the Service:**

   ```bash
   # Development mode with hot-reload
   uvicorn src.main:app --reload --host 0.0.0.0 --port 8003
   ```

5. **Test it:** Open http://localhost:8003/docs in the browser.

   ```bash
   # Or via curl
   curl http://localhost:8003/api/v1/health
   ```

### Detailed Setup

See [DEPLOYMENT.md](./DEPLOYMENT.md) for:

- Complete installation guide
- GPU configuration
- Model selection
- Performance tuning
- Troubleshooting

## Project Structure

```
llm-agent/
├── src/
│   ├── main.py                 # FastAPI application
│   ├── config.py               # Configuration management
│   ├── core/                   # Core LLM functionality
│   │   ├── llm_client.py       # Ollama/Claude client
│   │   ├── prompt_manager.py   # System prompts
│   │   └── context_manager.py  # Conversation context
│   ├── tools/                  # Trading tools (function calling)
│   │   ├── base.py             # Tool base classes
│   │   ├── signals.py          # ML signals & analysis
│   │   ├── portfolio.py        # Portfolio management
│   │   ├── trading.py          # Trading execution
│   │   └── education.py        # Educational tools
│   ├── prompts/                # System prompts
│   │   ├── system.txt          # Main trading copilot prompt
│   │   ├── analysis.txt        # Analysis template
│   │   └── strategy.txt        # Strategy template
│   ├── api/                    # API routes
│   │   └── routes.py           # All endpoints
│   ├── models/                 # Pydantic models
│   ├── services/               # Business logic
│   └── repositories/           # Data access layer
├── tests/
├── docker-compose.ollama.yml   # Ollama GPU setup
├── DEPLOYMENT.md               # Detailed deployment guide
├── requirements.txt
├── environment.yml
└── .env.example
```

## API Endpoints

### Core Endpoints

- `GET /` - Service info and health
- `GET /api/v1/health` - Detailed health check with LLM status
- `GET /api/v1/models` - List available LLM models
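Streaming responses (noted under Streaming Responses above) arrive as Server-Sent Events. As a minimal sketch of what a client has to do with such a stream, the parser below splits SSE frames on blank lines and yields each `data:` payload; the payload format shown in the example is an assumption for illustration, not the service's documented wire format.

```python
from typing import Iterator, List


def parse_sse(lines: Iterator[bytes]) -> Iterator[str]:
    """Yield the `data:` payload of each Server-Sent Event.

    Per the SSE format, a blank line terminates an event and
    multi-line `data:` fields are joined with newlines.
    """
    buffer: List[str] = []
    for raw in lines:
        line = raw.decode("utf-8").rstrip("\n")
        if not line:  # blank line ends the current event
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            buffer.append(line[len("data:"):].lstrip())
    if buffer:  # flush a trailing event that lacked a final blank line
        yield "\n".join(buffer)


# Example: two tokens streamed as separate SSE frames
stream = [b"data: Hello\n", b"\n", b"data: world\n", b"\n"]
print(list(parse_sse(iter(stream))))  # -> ['Hello', 'world']
```

In a real client, the byte lines would come from an HTTP library's streaming-response iterator rather than a hard-coded list.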
### Chat & Analysis

- `POST /api/v1/chat` - Interactive chat (supports streaming)
- `POST /api/v1/analyze` - Comprehensive symbol analysis
- `POST /api/v1/strategy` - Generate a trading strategy
- `POST /api/v1/explain` - Explain trading concepts

### Tools & Context

- `GET /api/v1/tools` - List available tools for the user's plan
- `DELETE /api/v1/context/{user_id}/{conversation_id}` - Clear a conversation

See the interactive docs at http://localhost:8003/docs.

## Development

### Code Quality

```bash
# Format code
black src/
isort src/

# Lint
flake8 src/

# Type checking
mypy src/
```

### Testing

```bash
# Run all tests
pytest

# With coverage
pytest --cov=src --cov-report=html

# Specific tests
pytest tests/unit/
```

## Trading Tools (Function Calling)

The agent has access to these tools:

### Market Data (Free)

- `get_analysis` - Current price, volume, 24h change
- `get_news` - Recent news with sentiment analysis
- `calculate_position_size` - Risk-based position sizing

### ML & Signals (Pro/Premium)

- `get_signal` - ML predictions with entry/exit levels
- `check_portfolio` - Portfolio overview and P&L
- `get_positions` - Detailed position information
- `get_trade_history` - Historical trades with metrics

### Trading (Pro/Premium)

- `execute_trade` - Execute paper-trading orders
- `set_alert` - Create price alerts

### Education (Free)

- `explain_concept` - Explain trading terms (RSI, MACD, AMD, etc.)
- `get_course_info` - Recommend learning resources

Tools are automatically filtered based on the user's subscription plan (Free/Pro/Premium).
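Plan-based filtering of this kind can be sketched as a small registry keyed by minimum plan tier. The `Tool` dataclass, registry contents, and tier ranks below are illustrative of the idea, not the actual `tools/base.py` implementation.

```python
from dataclasses import dataclass

# Higher rank = higher-tier plan; illustrative tiering
PLAN_RANK = {"free": 0, "pro": 1, "premium": 2}


@dataclass(frozen=True)
class Tool:
    name: str
    min_plan: str  # lowest plan allowed to call this tool


# A subset of the tools listed above, with assumed minimum plans
REGISTRY = [
    Tool("get_analysis", "free"),
    Tool("explain_concept", "free"),
    Tool("get_signal", "pro"),
    Tool("execute_trade", "pro"),
]


def tools_for_plan(plan: str) -> list:
    """Return the names of tools the given subscription plan may call."""
    rank = PLAN_RANK[plan.lower()]
    return [t.name for t in REGISTRY if PLAN_RANK[t.min_plan] <= rank]


print(tools_for_plan("free"))  # -> ['get_analysis', 'explain_concept']
```

Keeping the permission check in one registry function means the LLM is only ever offered tool schemas its user can actually invoke.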
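The arithmetic behind a risk-based sizing tool such as `calculate_position_size` is standard: the amount risked (equity times risk fraction) divided by the per-unit distance to the stop. The function name and parameters below are illustrative, not the tool's actual schema.

```python
def position_size(account_equity: float, risk_pct: float,
                  entry: float, stop_loss: float) -> float:
    """Units to buy so that a stop-out loses at most risk_pct of equity.

    size = (equity * risk_pct) / |entry - stop_loss|
    """
    if entry == stop_loss:
        raise ValueError("entry and stop loss must differ")
    risk_amount = account_equity * risk_pct       # capital at risk
    per_unit_risk = abs(entry - stop_loss)        # loss per unit if stopped out
    return risk_amount / per_unit_risk


# Risking 1% of a $10,000 account with entry $100 and stop $95:
# $100 at risk / $5 per unit = 20 units
print(position_size(10_000, 0.01, entry=100.0, stop_loss=95.0))  # -> 20.0
```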
## System Prompt & Trading Philosophy

The agent is guided by a comprehensive system prompt that covers:

- **AMD Framework:** Identifies Accumulation, Manipulation, and Distribution phases
- **Risk Management:** Always prioritizes proper position sizing and stop losses
- **Educational Approach:** Explains the "why" behind every recommendation
- **Multi-timeframe Analysis:** Considers multiple timeframes for context
- **Data-Driven:** Uses tools to fetch real data, never invents prices

The agent will:

- ✅ Provide educational analysis with clear risk management
- ✅ Explain concepts in simple terms for all levels
- ✅ Use real market data via tools
- ✅ Warn against risky behavior
- ❌ NEVER give financial advice or guarantee returns
- ❌ NEVER invent market data or prices

## Architecture

Built following SOLID principles:

- **LLM Client Abstraction:** Unified interface for Ollama/Claude/OpenAI
- **Tool Registry:** Dynamic tool loading with permission checking
- **Context Manager:** Maintains conversation history efficiently
- **Prompt Manager:** Centralized prompt templates
- **Streaming Support:** Real-time responses with SSE

## Configuration

Key environment variables:

```bash
# LLM Provider
LLM_PROVIDER=ollama  # ollama, claude, or openai
OLLAMA_BASE_URL=http://localhost:11434
LLM_MODEL=llama3:8b

# Service URLs
BACKEND_URL=http://localhost:8000
DATA_SERVICE_URL=http://localhost:8001
ML_ENGINE_URL=http://localhost:8002

# Optional: Claude fallback
# ANTHROPIC_API_KEY=sk-ant-xxx

# Database & Cache
DATABASE_URL=postgresql://...
REDIS_URL=redis://localhost:6379/0
```

## Model Recommendations

For an **RTX 5060 Ti (16GB VRAM):**

| Model | Size | Speed | Quality | Best For |
|-------|------|-------|---------|----------|
| `llama3:8b` | 4.7GB | ⚡⚡⚡ | ⭐⭐⭐⭐ | **Recommended** - best balance |
| `mistral:7b` | 4.1GB | ⚡⚡⚡⚡ | ⭐⭐⭐ | Fast responses, good quality |
| `llama3:70b` | 40GB+ | ⚡ | ⭐⭐⭐⭐⭐ | Requires 40GB+ VRAM |
| `mixtral:8x7b` | 26GB | ⚡⚡ | ⭐⭐⭐⭐ | Requires 32GB+ VRAM |

**Recommendation:** Start with `llama3:8b` on 16GB-class hardware.

## Documentation

- [DEPLOYMENT.md](./DEPLOYMENT.md) - Complete deployment guide
- [API Documentation](http://localhost:8003/docs) - Interactive API docs
- [Specification Docs](/docs/02-definicion-modulos/OQI-007-llm-agent/) - Technical specifications

## License

Proprietary - OrbiQuant IA Trading Platform