# LLM Agent Service

AI-powered trading copilot with local GPU support for the OrbiQuant IA Trading Platform.

## Overview

This service provides an intelligent trading agent that runs locally on GPU using **Ollama**, with support for:

- **Local LLM Inference:** Run Llama 3, Mistral, and other models on your GPU
- **Trading Analysis:** Technical analysis, market sentiment, AMD phase identification
- **Function Calling (Tools):** Get real-time data, ML signals, and portfolio info
- **Educational Support:** Explain concepts, recommend learning paths
- **Risk Management:** Position sizing, stop-loss recommendations
- **Streaming Responses:** Real-time chat with Server-Sent Events

## Why Local LLM?

- **Privacy:** Your trading conversations stay on your hardware
- **Cost:** No API costs after the initial setup
- **Speed:** Low latency with a local GPU
- **Customization:** Full control over model and prompts
- **Always Available:** No internet dependency for inference

## Technology Stack

- **Python:** 3.11+
- **LLM Provider:** Ollama (local GPU) with Claude/OpenAI fallback
- **Models:** Llama 3 8B (recommended), Mistral 7B, Mixtral 8x7B
- **API Framework:** FastAPI with SSE streaming
- **Tools System:** Custom function calling for trading operations
- **Database:** PostgreSQL (asyncpg)
- **Cache:** Redis
- **Testing:** pytest, pytest-asyncio

## Setup

### Prerequisites

- **GPU:** NVIDIA RTX 5060 Ti (16GB VRAM) or better
- **RAM:** 16GB minimum, 32GB recommended
- **OS:** Linux (Ubuntu 20.04+) or WSL2
- **Software:**
  - Docker with the NVIDIA Container Toolkit
  - Python 3.11+
  - Miniconda or Anaconda

### Quick Start

1. **Start Ollama:**

   ```bash
   cd /home/isem/workspace/projects/trading-platform/apps/llm-agent

   # Start Ollama with GPU support
   docker-compose -f docker-compose.ollama.yml up -d

   # Pull the Llama 3 8B model (recommended for 16GB VRAM)
   docker exec orbiquant-ollama ollama pull llama3:8b
   ```

2.
   **Configure Service:**

   ```bash
   # Copy the environment file
   cp .env.example .env

   # Edit the configuration (LLM_PROVIDER=ollama is the default)
   nano .env
   ```

3. **Install Dependencies:**

   ```bash
   # Create the conda environment
   conda env create -f environment.yml
   conda activate orbiquant-llm-agent

   # Or install with pip
   pip install -r requirements.txt
   ```

4. **Start the Service:**

   ```bash
   # Development mode with hot-reload
   uvicorn src.main:app --reload --host 0.0.0.0 --port 8003
   ```

5. **Test it:** Open http://localhost:8003/docs in the browser.

   ```bash
   # Or via curl
   curl http://localhost:8003/api/v1/health
   ```

### Detailed Setup

See [DEPLOYMENT.md](./DEPLOYMENT.md) for:

- Complete installation guide
- GPU configuration
- Model selection
- Performance tuning
- Troubleshooting

## Project Structure

```
llm-agent/
├── src/
│   ├── main.py                 # FastAPI application
│   ├── config.py               # Configuration management
│   ├── core/                   # Core LLM functionality
│   │   ├── llm_client.py       # Ollama/Claude client
│   │   ├── prompt_manager.py   # System prompts
│   │   └── context_manager.py  # Conversation context
│   ├── tools/                  # Trading tools (function calling)
│   │   ├── base.py             # Tool base classes
│   │   ├── signals.py          # ML signals & analysis
│   │   ├── portfolio.py        # Portfolio management
│   │   ├── trading.py          # Trading execution
│   │   └── education.py        # Educational tools
│   ├── prompts/                # System prompts
│   │   ├── system.txt          # Main trading copilot prompt
│   │   ├── analysis.txt        # Analysis template
│   │   └── strategy.txt        # Strategy template
│   ├── api/                    # API routes
│   │   └── routes.py           # All endpoints
│   ├── models/                 # Pydantic models
│   ├── services/               # Business logic
│   └── repositories/           # Data access layer
├── tests/
├── docker-compose.ollama.yml   # Ollama GPU setup
├── DEPLOYMENT.md               # Detailed deployment guide
├── requirements.txt
├── environment.yml
└── .env.example
```

## API Endpoints

### Core Endpoints

- `GET /` - Service info and health
- `GET /api/v1/health` - Detailed health check with LLM status
- `GET /api/v1/models` - List available LLM models
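Streaming responses (noted under Streaming Responses above) arrive as Server-Sent Events. As a minimal sketch of what a client has to do with such a stream, the parser below splits SSE frames on blank lines and yields each `data:` payload; the payload format shown in the example is an assumption for illustration, not the service's documented wire format.

```python
from typing import Iterator, List


def parse_sse(lines: Iterator[bytes]) -> Iterator[str]:
    """Yield the `data:` payload of each Server-Sent Event.

    Per the SSE format, a blank line terminates an event and
    multi-line `data:` fields are joined with newlines.
    """
    buffer: List[str] = []
    for raw in lines:
        line = raw.decode("utf-8").rstrip("\n")
        if not line:  # blank line ends the current event
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            buffer.append(line[len("data:"):].lstrip())
    if buffer:  # flush a trailing event that lacked a final blank line
        yield "\n".join(buffer)


# Example: two tokens streamed as separate SSE frames
stream = [b"data: Hello\n", b"\n", b"data: world\n", b"\n"]
print(list(parse_sse(iter(stream))))  # -> ['Hello', 'world']
```

In a real client, the byte lines would come from an HTTP library's streaming-response iterator rather than a hard-coded list.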
### Chat & Analysis

- `POST /api/v1/chat` - Interactive chat (supports streaming)
- `POST /api/v1/analyze` - Comprehensive symbol analysis
- `POST /api/v1/strategy` - Generate a trading strategy
- `POST /api/v1/explain` - Explain trading concepts

### Tools & Context

- `GET /api/v1/tools` - List available tools for the user's plan
- `DELETE /api/v1/context/{user_id}/{conversation_id}` - Clear a conversation

See the interactive docs at http://localhost:8003/docs.

## Development

### Code Quality

```bash
# Format code
black src/
isort src/

# Lint
flake8 src/

# Type checking
mypy src/
```

### Testing

```bash
# Run all tests
pytest

# With coverage
pytest --cov=src --cov-report=html

# Specific tests
pytest tests/unit/
```

## Trading Tools (Function Calling)

The agent has access to these tools:

### Market Data (Free)

- `get_analysis` - Current price, volume, 24h change
- `get_news` - Recent news with sentiment analysis
- `calculate_position_size` - Risk-based position sizing

### ML & Signals (Pro/Premium)

- `get_signal` - ML predictions with entry/exit levels
- `check_portfolio` - Portfolio overview and P&L
- `get_positions` - Detailed position information
- `get_trade_history` - Historical trades with metrics

### Trading (Pro/Premium)

- `execute_trade` - Execute paper-trading orders
- `set_alert` - Create price alerts

### Education (Free)

- `explain_concept` - Explain trading terms (RSI, MACD, AMD, etc.)
- `get_course_info` - Recommend learning resources

Tools are automatically filtered based on the user's subscription plan (Free/Pro/Premium).
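Plan-based filtering of this kind can be sketched as a small registry keyed by minimum plan tier. The `Tool` dataclass, registry contents, and tier ranks below are illustrative of the idea, not the actual `tools/base.py` implementation.

```python
from dataclasses import dataclass

# Higher rank = higher-tier plan; illustrative tiering
PLAN_RANK = {"free": 0, "pro": 1, "premium": 2}


@dataclass(frozen=True)
class Tool:
    name: str
    min_plan: str  # lowest plan allowed to call this tool


# A subset of the tools listed above, with assumed minimum plans
REGISTRY = [
    Tool("get_analysis", "free"),
    Tool("explain_concept", "free"),
    Tool("get_signal", "pro"),
    Tool("execute_trade", "pro"),
]


def tools_for_plan(plan: str) -> list:
    """Return the names of tools the given subscription plan may call."""
    rank = PLAN_RANK[plan.lower()]
    return [t.name for t in REGISTRY if PLAN_RANK[t.min_plan] <= rank]


print(tools_for_plan("free"))  # -> ['get_analysis', 'explain_concept']
```

Keeping the permission check in one registry function means the LLM is only ever offered tool schemas its user can actually invoke.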
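The arithmetic behind a risk-based sizing tool such as `calculate_position_size` is standard: the amount risked (equity times risk fraction) divided by the per-unit distance to the stop. The function name and parameters below are illustrative, not the tool's actual schema.

```python
def position_size(account_equity: float, risk_pct: float,
                  entry: float, stop_loss: float) -> float:
    """Units to buy so that a stop-out loses at most risk_pct of equity.

    size = (equity * risk_pct) / |entry - stop_loss|
    """
    if entry == stop_loss:
        raise ValueError("entry and stop loss must differ")
    risk_amount = account_equity * risk_pct       # capital at risk
    per_unit_risk = abs(entry - stop_loss)        # loss per unit if stopped out
    return risk_amount / per_unit_risk


# Risking 1% of a $10,000 account with entry $100 and stop $95:
# $100 at risk / $5 per unit = 20 units
print(position_size(10_000, 0.01, entry=100.0, stop_loss=95.0))  # -> 20.0
```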
## System Prompt & Trading Philosophy

The agent is guided by a comprehensive system prompt that covers:

- **AMD Framework:** Identifies Accumulation, Manipulation, and Distribution phases
- **Risk Management:** Always prioritizes proper position sizing and stop losses
- **Educational Approach:** Explains the "why" behind every recommendation
- **Multi-timeframe Analysis:** Considers multiple timeframes for context
- **Data-Driven:** Uses tools to fetch real data, never invents prices

The agent will:

- ✅ Provide educational analysis with clear risk management
- ✅ Explain concepts in simple terms for all levels
- ✅ Use real market data via tools
- ✅ Warn against risky behavior
- ❌ NEVER give financial advice or guarantee returns
- ❌ NEVER invent market data or prices

## Architecture

Built following SOLID principles:

- **LLM Client Abstraction:** Unified interface for Ollama/Claude/OpenAI
- **Tool Registry:** Dynamic tool loading with permission checking
- **Context Manager:** Maintains conversation history efficiently
- **Prompt Manager:** Centralized prompt templates
- **Streaming Support:** Real-time responses with SSE

## Configuration

Key environment variables:

```bash
# LLM Provider
LLM_PROVIDER=ollama  # ollama, claude, or openai
OLLAMA_BASE_URL=http://localhost:11434
LLM_MODEL=llama3:8b

# Service URLs
BACKEND_URL=http://localhost:8000
DATA_SERVICE_URL=http://localhost:8001
ML_ENGINE_URL=http://localhost:8002

# Optional: Claude fallback
# ANTHROPIC_API_KEY=sk-ant-xxx

# Database & Cache
DATABASE_URL=postgresql://...
REDIS_URL=redis://localhost:6379/0
```

## Model Recommendations

For an **RTX 5060 Ti (16GB VRAM):**

| Model | Size | Speed | Quality | Best For |
|-------|------|-------|---------|----------|
| `llama3:8b` | 4.7GB | ⚡⚡⚡ | ⭐⭐⭐⭐ | **Recommended** - best balance |
| `mistral:7b` | 4.1GB | ⚡⚡⚡⚡ | ⭐⭐⭐ | Fast responses, good quality |
| `llama3:70b` | 40GB+ | ⚡ | ⭐⭐⭐⭐⭐ | Requires 40GB+ VRAM |
| `mixtral:8x7b` | 26GB | ⚡⚡ | ⭐⭐⭐⭐ | Requires 32GB+ VRAM |

**Recommendation:** Start with `llama3:8b` on 16GB-class hardware.

## Documentation

- [DEPLOYMENT.md](./DEPLOYMENT.md) - Complete deployment guide
- [API Documentation](http://localhost:8003/docs) - Interactive API docs
- [Specification Docs](/docs/02-definicion-modulos/OQI-007-llm-agent/) - Technical specifications

## License

Proprietary - OrbiQuant IA Trading Platform