trading-platform/docs/01-arquitectura/INTEGRACION-LLM-LOCAL.md
---
id: INTEGRACION-LLM-LOCAL
title: Local LLM Integration - chatgpt-oss 16GB
type: Documentation
project: trading-platform
version: 1.0.0
updated_date: 2026-01-04
---

# Local LLM Integration - chatgpt-oss 16GB

**Version:** 1.0.0 | **Date:** 2025-12-08 | **Module:** OQI-007-llm-agent | **Author:** Trading Strategist - Trading Platform


## Table of Contents

1. Overview
2. Hardware Specifications
3. Integration Architecture
4. Model Configuration
5. Trading Tools
6. System Prompt
7. API Endpoints
8. Context Management
9. Implementation
10. Testing and Validation

## Overview

### Objective

Integrate a local LLM (chatgpt-oss or equivalent) that acts as a trading copilot, able to:

1. Analyze ML signals and explain them in natural language
2. Make trading decisions based on context
3. Execute trades via MetaTrader4
4. Answer questions about markets and strategies
5. Manage alerts and notifications

### High-Level Architecture

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                         LLM TRADING COPILOT                                  │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  User Interface                                                              │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │  Chat Interface / CLI / Telegram Bot                                │   │
│  └───────────────────────────────────┬─────────────────────────────────┘   │
│                                      │                                      │
│                                      ▼                                      │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │                    LLM Service (FastAPI)                            │   │
│  │  ┌─────────────────────────────────────────────────────────────────┐│   │
│  │  │  Request Handler                                                ││   │
│  │  │  - Parse user input                                             ││   │
│  │  │  - Load context from Redis                                      ││   │
│  │  │  - Build prompt                                                 ││   │
│  │  └──────────────────────────────────┬──────────────────────────────┘│   │
│  │                                     │                               │   │
│  │  ┌──────────────────────────────────▼──────────────────────────────┐│   │
│  │  │  LLM Engine (chatgpt-oss / Llama3)                              ││   │
│  │  │  ┌────────────────────────────────────────────────────────────┐ ││   │
│  │  │  │  Ollama / vLLM / llama.cpp                                 │ ││   │
│  │  │  │  - GPU: NVIDIA RTX 5060 Ti 16GB                            │ ││   │
│  │  │  │  - Context: 8K tokens                                      │ ││   │
│  │  │  │  - Response time: <3s                                      │ ││   │
│  │  │  └────────────────────────────────────────────────────────────┘ ││   │
│  │  └──────────────────────────────────┬──────────────────────────────┘│   │
│  │                                     │                               │   │
│  │  ┌──────────────────────────────────▼──────────────────────────────┐│   │
│  │  │  Tool Executor                                                  ││   │
│  │  │  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐           ││   │
│  │  │  │get_signal│ │ analyze  │ │ execute  │ │ portfolio│           ││   │
│  │  │  │          │ │ _market  │ │ _trade   │ │ _status  │           ││   │
│  │  │  └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘           ││   │
│  │  │       │            │            │            │                  ││   │
│  │  └───────┼────────────┼────────────┼────────────┼──────────────────┘│   │
│  └──────────┼────────────┼────────────┼────────────┼────────────────────┘   │
│             │            │            │            │                        │
│             ▼            ▼            ▼            ▼                        │
│  ┌──────────────────────────────────────────────────────────────────────┐  │
│  │                        External Services                              │  │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  │  │
│  │  │  ML Engine  │  │  PostgreSQL │  │ MetaTrader4 │  │    Redis    │  │  │
│  │  │  (signals)  │  │   (data)    │  │  (trading)  │  │  (context)  │  │  │
│  │  └─────────────┘  └─────────────┘  └─────────────┘  └─────────────┘  │  │
│  └──────────────────────────────────────────────────────────────────────┘  │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
```

## Hardware Specifications

### Available GPU

| Specification | Value |
|---------------|-------|
| Model | NVIDIA RTX 5060 Ti |
| VRAM | 16GB GDDR6X |
| CUDA Cores | TBD |
| Tensor Cores | Yes |

### Compatible Models

| Model | VRAM Req. | Context | Speed | Recommendation |
|-------|-----------|---------|-------|----------------|
| Llama 3 8B | ~10GB | 8K | ~20 tok/s | Recommended |
| Mistral 7B | ~8GB | 8K | ~25 tok/s | Alternative |
| Qwen2 7B | ~9GB | 32K | ~22 tok/s | Alternative |
| Phi-3 Mini 3.8B | ~4GB | 4K | ~40 tok/s | Backup |
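The VRAM figures above follow roughly from parameter count times quantization bit width, plus overhead for the KV cache and runtime buffers. A back-of-the-envelope sketch (the 2GB overhead constant is an illustrative assumption; real usage at long context is higher, which is why the table lists ~10GB for Llama 3 8B):

```python
def estimate_vram_gb(params_billions: float, bits_per_weight: float,
                     overhead_gb: float = 2.0) -> float:
    """Rough VRAM estimate: weights (params * bits / 8) plus a flat
    overhead for KV cache, activations and runtime buffers."""
    weights_gb = params_billions * bits_per_weight / 8
    return round(weights_gb + overhead_gb, 1)

# Llama 3 8B at ~5.5 bits/weight (Q5_K_M): weights alone are ~5.5GB
print(estimate_vram_gb(8, 5.5))
```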

### Optimal Configuration

```yaml
# Llama 3 8B Instruct
model:
  name: llama3:8b-instruct-q5_K_M
  quantization: Q5_K_M  # quality/VRAM trade-off
  context_length: 8192
  batch_size: 512

gpu:
  device: cuda:0
  memory_fraction: 0.85  # 13.6GB of 16GB
  offload: false

inference:
  temperature: 0.7
  top_p: 0.9
  max_tokens: 2048
  repeat_penalty: 1.1
```

## Integration Architecture

### Components

```
┌──────────────────────────────────────────────────────────────────────┐
│                    LLM SERVICE COMPONENTS                             │
├──────────────────────────────────────────────────────────────────────┤
│                                                                       │
│  1. LLM Runtime (Ollama)                                             │
│     ├── Model Server (GPU)                                           │
│     ├── REST API (:11434)                                            │
│     └── Model Management                                             │
│                                                                       │
│  2. Trading Service (FastAPI)                                        │
│     ├── Chat Endpoint (/api/chat)                                    │
│     ├── Tool Executor                                                │
│     └── Response Formatter                                           │
│                                                                       │
│  3. Context Manager (Redis)                                          │
│     ├── Conversation History                                         │
│     ├── Market Context                                               │
│     └── User Preferences                                             │
│                                                                       │
│  4. Trading Tools                                                    │
│     ├── get_ml_signal()                                              │
│     ├── analyze_market()                                             │
│     ├── execute_trade()                                              │
│     ├── get_portfolio()                                              │
│     └── set_alert()                                                  │
│                                                                       │
└──────────────────────────────────────────────────────────────────────┘
```

### Request Flow

```
User Message
      │
      ▼
┌─────────────────┐
│ FastAPI Handler │
└────────┬────────┘
         │
         ▼
┌─────────────────┐     ┌─────────────────┐
│ Load Context    │◄───▶│     Redis       │
│ from Redis      │     └─────────────────┘
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Build Prompt    │
│ (system + hist  │
│  + user msg)    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐     ┌─────────────────┐
│ LLM Inference   │◄───▶│ Ollama (GPU)    │
│                 │     └─────────────────┘
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Parse Response  │
│ Check for Tools │
└────────┬────────┘
         │
    ┌────┴────┐
    │ Tools?  │
    └────┬────┘
    Yes  │  No
    │    │
    ▼    ▼
┌───────┐ ┌───────────────┐
│Execute│ │Return Response│
│ Tools │ │               │
└───┬───┘ └───────────────┘
    │
    ▼
┌─────────────────┐
│ Format Results  │
│ Send to LLM     │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Final Response  │
│ Save to Context │
└─────────────────┘
```
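The flow above reduces to a single handler loop. A minimal sketch with stubbed dependencies (every helper callable here is a hypothetical stand-in for the Redis, Ollama, and tool-executor pieces described elsewhere in this document):

```python
def handle_message(session_id: str, user_msg: str,
                   load_context, call_llm, execute_tools, save_context) -> str:
    """One pass through the request flow: context -> prompt -> LLM ->
    optional tool round-trip -> final response."""
    history = load_context(session_id)
    prompt = {"system": "...", "history": history, "user": user_msg}
    response, tool_calls = call_llm(prompt)
    if tool_calls:  # second LLM pass, now with tool results in the prompt
        results = execute_tools(tool_calls)
        response, _ = call_llm({**prompt, "tool_results": results})
    save_context(session_id, user_msg, response)
    return response

# Stubbed demo: the first LLM pass requests a tool, the second one answers.
def fake_llm(prompt):
    if "tool_results" in prompt:
        return "XAUUSD signal: LONG", []
    return "", [("get_ml_signal", {"symbol": "XAUUSD"})]

print(handle_message(
    "s1", "signal?",
    load_context=lambda sid: [],
    call_llm=fake_llm,
    execute_tools=lambda calls: {"get_ml_signal": {"action": "LONG"}},
    save_context=lambda sid, u, a: None,
))
```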

## Model Configuration

### Installing Ollama

```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Download the model
ollama pull llama3:8b-instruct-q5_K_M

# Verify GPU inference (prints timing stats)
ollama run --verbose llama3:8b-instruct-q5_K_M

# Check VRAM usage
nvidia-smi
```

### Configuration File

```yaml
# config/llm_config.yaml

llm:
  provider: ollama
  model: llama3:8b-instruct-q5_K_M
  base_url: http://localhost:11434

  inference:
    temperature: 0.7
    top_p: 0.9
    max_tokens: 2048
    stop_sequences:
      - "</tool>"
      - "Human:"

  context:
    max_history: 10  # last 10 messages
    include_market_context: true
    include_portfolio: true

redis:
  host: localhost
  port: 6379
  db: 0
  prefix: "llm:"
  ttl: 3600  # 1 hour

tools:
  timeout: 30  # seconds per tool call
  retry_count: 3
  parallel_execution: true

logging:
  level: INFO
  file: logs/llm_service.log
```
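With `parallel_execution: true`, independent tool calls are dispatched concurrently; in Python that is one `asyncio.gather` over the pending calls, with the configured per-tool `timeout` applied to each call individually. A minimal sketch (the `sleep` stands in for the HTTP round-trip to a tool backend):

```python
import asyncio

async def run_tool(name: str, seconds: float) -> str:
    await asyncio.sleep(seconds)  # placeholder for the real HTTP call
    return f"{name}: ok"

async def run_tools_parallel(calls, timeout: float = 30.0):
    """Run all tool calls concurrently; each gets its own timeout."""
    tasks = [asyncio.wait_for(run_tool(n, s), timeout) for n, s in calls]
    # return_exceptions=True keeps one failed/timed-out tool from
    # cancelling the others
    return await asyncio.gather(*tasks, return_exceptions=True)

results = asyncio.run(run_tools_parallel(
    [("get_ml_signal", 0.01), ("get_portfolio", 0.01)]))
print(results)
```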

## Trading Tools

### Tool Definitions

```python
# tools/trading_tools.py

from typing import Dict, Any, Optional
import httpx

class TradingTools:
    """
    Trading tools exposed to the LLM
    """

    def __init__(self, config: Dict):
        self.ml_engine_url = config['ml_engine_url']
        self.trading_url = config['trading_url']
        self.data_url = config['data_url']

    async def get_ml_signal(
        self,
        symbol: str,
        timeframe: str = "5m"
    ) -> Dict[str, Any]:
        """
        Fetch the current ML signal for a symbol

        Args:
            symbol: Trading pair (XAUUSD, EURUSD, etc.)
            timeframe: Timeframe (5m, 15m, 1h)

        Returns:
            {
                "action": "LONG" | "SHORT" | "HOLD",
                "confidence": 0.78,
                "entry_price": 2650.50,
                "stop_loss": 2645.20,
                "take_profit": 2661.10,
                "risk_reward": 2.0,
                "amd_phase": "accumulation",
                "killzone": "london_open",
                "reasoning": ["AMD: Accumulation (78%)", ...]
            }
        """
        async with httpx.AsyncClient() as client:
            response = await client.get(
                f"{self.ml_engine_url}/api/signal",
                params={"symbol": symbol, "timeframe": timeframe}
            )
            response.raise_for_status()  # surface HTTP errors instead of parsing an error body
            return response.json()

    async def analyze_market(
        self,
        symbol: str
    ) -> Dict[str, Any]:
        """
        Analyze the current market state

        Returns:
            {
                "symbol": "XAUUSD",
                "current_price": 2650.50,
                "amd_phase": "accumulation",
                "ict_context": {
                    "killzone": "london_open",
                    "ote_zone": "discount",
                    "score": 0.72
                },
                "key_levels": {
                    "resistance": [2660.00, 2675.50],
                    "support": [2640.00, 2625.00]
                },
                "recent_signals": [...],
                "trend": "bullish",
                "volatility": "medium"
            }
        """
        async with httpx.AsyncClient() as client:
            response = await client.get(
                f"{self.ml_engine_url}/api/analysis",
                params={"symbol": symbol}
            )
            response.raise_for_status()
            return response.json()

    async def execute_trade(
        self,
        symbol: str,
        action: str,
        size: float,
        stop_loss: float,
        take_profit: float,
        account_id: Optional[str] = None
    ) -> Dict[str, Any]:
        """
        Execute a trade

        Args:
            symbol: Trading pair
            action: "BUY" or "SELL"
            size: Position size (lots)
            stop_loss: Stop loss price
            take_profit: Take profit price
            account_id: MT4 account ID (optional)

        Returns:
            {
                "success": true,
                "ticket": 123456,
                "executed_price": 2650.45,
                "slippage": 0.05,
                "message": "Order executed successfully"
            }
        """
        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"{self.trading_url}/api/trade",
                json={
                    "symbol": symbol,
                    "action": action,
                    "size": size,
                    "stop_loss": stop_loss,
                    "take_profit": take_profit,
                    "account_id": account_id
                }
            )
            response.raise_for_status()
            return response.json()

    async def get_portfolio(
        self,
        account_id: Optional[str] = None
    ) -> Dict[str, Any]:
        """
        Fetch the portfolio state

        Returns:
            {
                "balance": 10000.00,
                "equity": 10150.00,
                "margin": 500.00,
                "free_margin": 9650.00,
                "positions": [
                    {
                        "ticket": 123456,
                        "symbol": "XAUUSD",
                        "type": "BUY",
                        "size": 0.1,
                        "open_price": 2640.00,
                        "current_price": 2650.50,
                        "profit": 105.00,
                        "stop_loss": 2630.00,
                        "take_profit": 2660.00
                    }
                ],
                "daily_pnl": 250.00,
                "weekly_pnl": 450.00
            }
        """
        async with httpx.AsyncClient() as client:
            response = await client.get(
                f"{self.trading_url}/api/portfolio",
                params={"account_id": account_id}
            )
            response.raise_for_status()
            return response.json()

    async def set_alert(
        self,
        symbol: str,
        condition: str,
        price: float,
        message: str
    ) -> Dict[str, Any]:
        """
        Create a price alert

        Args:
            symbol: Trading pair
            condition: "above" or "below"
            price: Target price
            message: Alert message

        Returns:
            {
                "alert_id": "alert_123",
                "status": "active",
                "message": "Alert created successfully"
            }
        """
        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"{self.trading_url}/api/alerts",
                json={
                    "symbol": symbol,
                    "condition": condition,
                    "price": price,
                    "message": message
                }
            )
            response.raise_for_status()
            return response.json()

    async def get_market_data(
        self,
        symbol: str,
        timeframe: str = "5m",
        bars: int = 100
    ) -> Dict[str, Any]:
        """
        Fetch historical market data

        Returns:
            {
                "symbol": "XAUUSD",
                "timeframe": "5m",
                "data": [
                    {"time": "2024-12-08 10:00", "open": 2648, "high": 2652, ...},
                    ...
                ]
            }
        """
        async with httpx.AsyncClient() as client:
            response = await client.get(
                f"{self.data_url}/api/market_data",
                params={
                    "symbol": symbol,
                    "timeframe": timeframe,
                    "bars": bars
                }
            )
            response.raise_for_status()
            return response.json()
```

### Tool Schema for the LLM

```python
# tools/tool_schema.py

TOOL_DEFINITIONS = [
    {
        "name": "get_ml_signal",
        "description": "Returns the current trading signal generated by the ML models. Use this to get the system's recommendation.",
        "parameters": {
            "type": "object",
            "properties": {
                "symbol": {
                    "type": "string",
                    "description": "Trading pair (XAUUSD, EURUSD, GBPUSD, USDJPY)",
                    "enum": ["XAUUSD", "EURUSD", "GBPUSD", "USDJPY"]
                },
                "timeframe": {
                    "type": "string",
                    "description": "Analysis timeframe",
                    "enum": ["5m", "15m", "1h"],
                    "default": "5m"
                }
            },
            "required": ["symbol"]
        }
    },
    {
        "name": "analyze_market",
        "description": "Analyzes the current market state, including AMD phase, ICT context, key levels and trend.",
        "parameters": {
            "type": "object",
            "properties": {
                "symbol": {
                    "type": "string",
                    "description": "Trading pair",
                    "enum": ["XAUUSD", "EURUSD", "GBPUSD", "USDJPY"]
                }
            },
            "required": ["symbol"]
        }
    },
    {
        "name": "execute_trade",
        "description": "Executes a trade. IMPORTANT: Always confirm with the user before executing.",
        "parameters": {
            "type": "object",
            "properties": {
                "symbol": {
                    "type": "string",
                    "description": "Trading pair"
                },
                "action": {
                    "type": "string",
                    "description": "Order type",
                    "enum": ["BUY", "SELL"]
                },
                "size": {
                    "type": "number",
                    "description": "Position size in lots (0.01 - 10.0)"
                },
                "stop_loss": {
                    "type": "number",
                    "description": "Stop loss price"
                },
                "take_profit": {
                    "type": "number",
                    "description": "Take profit price"
                }
            },
            "required": ["symbol", "action", "size", "stop_loss", "take_profit"]
        }
    },
    {
        "name": "get_portfolio",
        "description": "Returns the current portfolio state, including balance, open positions and P&L.",
        "parameters": {
            "type": "object",
            "properties": {},
            "required": []
        }
    },
    {
        "name": "set_alert",
        "description": "Creates a price alert for notification.",
        "parameters": {
            "type": "object",
            "properties": {
                "symbol": {
                    "type": "string",
                    "description": "Trading pair"
                },
                "condition": {
                    "type": "string",
                    "enum": ["above", "below"]
                },
                "price": {
                    "type": "number",
                    "description": "Target price"
                },
                "message": {
                    "type": "string",
                    "description": "Alert message"
                }
            },
            "required": ["symbol", "condition", "price", "message"]
        }
    }
]
```
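Before dispatching a call the model produced, its arguments can be checked against these definitions. A hand-rolled sketch covering required fields and enum membership (a production setup might use a JSON Schema validator library instead):

```python
def validate_tool_args(tool_def: dict, args: dict) -> list:
    """Return a list of problems; an empty list means the call is valid."""
    errors = []
    params = tool_def["parameters"]
    for name in params.get("required", []):
        if name not in args:
            errors.append(f"missing required argument: {name}")
    for name, value in args.items():
        spec = params["properties"].get(name)
        if spec is None:
            errors.append(f"unknown argument: {name}")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name}: {value!r} not in {spec['enum']}")
    return errors

# Same shape as the get_ml_signal entry in TOOL_DEFINITIONS above
signal_def = {
    "name": "get_ml_signal",
    "parameters": {
        "type": "object",
        "properties": {
            "symbol": {"type": "string",
                       "enum": ["XAUUSD", "EURUSD", "GBPUSD", "USDJPY"]},
            "timeframe": {"type": "string", "enum": ["5m", "15m", "1h"]},
        },
        "required": ["symbol"],
    },
}

print(validate_tool_args(signal_def, {"symbol": "XAUUSD"}))   # valid: []
print(validate_tool_args(signal_def, {"timeframe": "2h"}))    # missing symbol + bad enum
```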

## System Prompt

### Main Prompt

```python
# prompts/system_prompt.py

SYSTEM_PROMPT = """You are Trading Platform AI, a trading copilot specialized in forex and precious metals markets.

## Your Role
- Analyze signals from the ML models and explain them clearly
- Help the user make informed trading decisions
- Execute trades when the user authorizes them
- Manage risk and protect capital

## Knowledge
You have access to ML models that analyze:
- **AMD (Accumulation-Manipulation-Distribution)**: market phases
  - Accumulation: institutions accumulating, best for longs
  - Manipulation: stop hunts, avoid entries
  - Distribution: institutions selling, best for shorts
- **ICT Concepts**: Killzones, OTE, Premium/Discount
  - London Open (02:00-05:00 EST): high probability
  - NY AM (08:30-11:00 EST): maximum liquidity
  - Discount Zone (0-50%): best for buys
  - Premium Zone (50-100%): best for sells
- **SMC**: BOS, CHOCH, Inducement, Displacement

## Available Tools
1. `get_ml_signal(symbol, timeframe)` - Current ML signal
2. `analyze_market(symbol)` - Full market analysis
3. `execute_trade(symbol, action, size, sl, tp)` - Execute a trade
4. `get_portfolio()` - Portfolio state
5. `set_alert(symbol, condition, price, message)` - Create an alert

## Important Rules
1. **ALWAYS** explain the reasoning behind every recommendation
2. **NEVER** execute trades without explicit user confirmation
3. **ALWAYS** state the risk (R:R, % of account)
4. **PRIORITIZE** capital preservation
5. If model confidence is <60%, recommend WAITING
6. During a Manipulation phase, recommend NOT TRADING

## Response Format for Signals
When presenting a signal, use this format:

**Signal: [LONG/SHORT/HOLD]**
- Symbol: [SYMBOL]
- Confidence: [X]%
- AMD Phase: [phase]
- Killzone: [killzone]

**Levels:**
- Entry: [price]
- Stop Loss: [price] ([X] pips)
- Take Profit: [price] ([X] pips)
- R:R: [X]:1

**Reasoning:**
1. [reason 1]
2. [reason 2]
3. [reason 3]

**Risk:** [X]% of account

---

## Current Context
{market_context}

## Portfolio
{portfolio_context}
"""

def build_system_prompt(market_context: dict = None, portfolio_context: dict = None) -> str:
    """Build the system prompt with live context"""

    market_str = ""
    if market_context:
        market_str = f"""
Current price: {market_context.get('current_price', 'N/A')}
AMD phase: {market_context.get('amd_phase', 'N/A')}
Killzone: {market_context.get('killzone', 'N/A')}
Trend: {market_context.get('trend', 'N/A')}
"""

    portfolio_str = ""
    if portfolio_context:
        portfolio_str = f"""
Balance: ${portfolio_context.get('balance', 0):,.2f}
Equity: ${portfolio_context.get('equity', 0):,.2f}
Open positions: {len(portfolio_context.get('positions', []))}
Daily P&L: ${portfolio_context.get('daily_pnl', 0):,.2f}
"""

    return SYSTEM_PROMPT.format(
        market_context=market_str or "Not available",
        portfolio_context=portfolio_str or "Not available"
    )
```
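One caveat with the `.format` call above: the template must contain no literal braces other than its placeholders, so any JSON example added to a future prompt revision has to double its braces or `.format` will fail. A trimmed-down illustration of the same pattern:

```python
# Mini version of the template pattern; literal braces are doubled.
TEMPLATE = (
    "## Current Context\n{market_context}\n\n"
    "## Portfolio\n{portfolio_context}\n\n"
    "JSON example: {{\"action\": \"LONG\"}}\n"
)

filled = TEMPLATE.format(
    market_context="AMD phase: accumulation",
    portfolio_context="Not available",
)
print(filled)  # the {{ }} pair comes out as single literal braces
```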

## API Endpoints

### FastAPI Service

````python
# services/llm_service.py

import json
import re
from typing import List, Optional

import httpx
import redis
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

# NOTE: helpers such as load_conversation_history, get_market_context,
# build_system_prompt and execute_tool come from the modules described
# elsewhere in this document.

app = FastAPI(title="Trading Platform LLM Service")

class Message(BaseModel):
    role: str  # "user" or "assistant"
    content: str

class ChatRequest(BaseModel):
    message: str
    session_id: str
    symbol: Optional[str] = "XAUUSD"

class ChatResponse(BaseModel):
    response: str
    tools_used: List[str]
    signal: Optional[dict] = None

# Redis connection
redis_client = redis.Redis(host='localhost', port=6379, db=0)

@app.post("/api/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    """
    Main chat endpoint
    """
    try:
        # 1. Load conversation history
        history = load_conversation_history(request.session_id)

        # 2. Get market context
        market_context = await get_market_context(request.symbol)

        # 3. Get portfolio context
        portfolio_context = await get_portfolio_context()

        # 4. Build prompt
        system_prompt = build_system_prompt(market_context, portfolio_context)

        # 5. Call LLM
        response, tools_used = await call_llm(
            system_prompt=system_prompt,
            history=history,
            user_message=request.message
        )

        # 6. Save to history
        save_conversation(request.session_id, request.message, response)

        # 7. Extract signal if present
        signal = extract_signal_from_response(response)

        return ChatResponse(
            response=response,
            tools_used=tools_used,
            signal=signal
        )

    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@app.get("/api/health")
async def health():
    """Health check"""
    return {"status": "healthy", "llm_status": await check_llm_status()}

@app.post("/api/clear_history")
async def clear_history(session_id: str):
    """Clear the conversation history for a session"""
    redis_client.delete(f"llm:history:{session_id}")
    return {"status": "cleared"}

async def call_llm(
    system_prompt: str,
    history: List[Message],
    user_message: str
) -> tuple[str, List[str]]:
    """
    Call the LLM via Ollama
    """
    # Build messages
    messages = [{"role": "system", "content": system_prompt}]

    for msg in history:
        messages.append({"role": msg.role, "content": msg.content})

    messages.append({"role": "user", "content": user_message})

    # Call Ollama
    async with httpx.AsyncClient(timeout=60.0) as client:
        response = await client.post(
            "http://localhost:11434/api/chat",
            json={
                "model": "llama3:8b-instruct-q5_K_M",
                "messages": messages,
                "stream": False,
                "options": {
                    "temperature": 0.7,
                    "top_p": 0.9,
                    "num_predict": 2048
                }
            }
        )

        result = response.json()
        llm_response = result["message"]["content"]

    # Check for tool calls
    tools_used = []
    if "<tool>" in llm_response:
        llm_response, tools_used = await process_tool_calls(llm_response)

    return llm_response, tools_used

async def process_tool_calls(response: str) -> tuple[str, List[str]]:
    """
    Process tool calls embedded in the LLM response
    """
    tools_used = []
    tool_pattern = r"<tool>(\w+)\((.*?)\)</tool>"

    matches = re.findall(tool_pattern, response)

    for tool_name, args_str in matches:
        tools_used.append(tool_name)

        # Parse arguments (the model must emit JSON key/value pairs,
        # e.g. "symbol": "XAUUSD", for this to parse)
        args = json.loads(f"{{{args_str}}}")

        # Execute tool
        tool_result = await execute_tool(tool_name, args)

        # Replace the tool tag with the formatted result
        response = response.replace(
            f"<tool>{tool_name}({args_str})</tool>",
            f"\n**{tool_name} result:**\n```json\n{json.dumps(tool_result, indent=2)}\n```\n"
        )

    return response, tools_used
````
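The `<tool>` parsing in `process_tool_calls` hinges on the regex and on the model emitting arguments as JSON key/value pairs (`"key": value`, not Python-style `key=value`). That parsing step can be exercised in isolation:

```python
import json
import re

TOOL_PATTERN = r"<tool>(\w+)\((.*?)\)</tool>"

# Sample model output using the expected JSON key/value argument style
response = 'Checking: <tool>get_ml_signal("symbol": "XAUUSD", "timeframe": "5m")</tool>'

calls = []
for name, args_str in re.findall(TOOL_PATTERN, response):
    # wrapping args_str in braces yields a valid JSON object
    calls.append((name, json.loads("{" + args_str + "}")))

print(calls)
```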

## Context Management

### Redis Schema

```python
# context/redis_schema.py

"""
Redis key structure:

llm:history:{session_id}           - Conversation history (LIST)
llm:market_context:{symbol}        - Market context cache (STRING, TTL=60s)
llm:portfolio:{user_id}            - Portfolio cache (STRING, TTL=30s)
llm:user_prefs:{user_id}           - User preferences (HASH)
llm:alerts:{user_id}               - Active alerts (SET)
"""

import json
from typing import Dict, List, Optional

import redis

class ContextManager:
    def __init__(self, redis_client: redis.Redis):
        self.redis = redis_client
        self.history_ttl = 3600  # 1 hour
        self.cache_ttl = 60  # 60 seconds

    def get_conversation_history(
        self,
        session_id: str,
        max_messages: int = 10
    ) -> List[Dict]:
        """Return the most recent conversation messages"""
        key = f"llm:history:{session_id}"
        history = self.redis.lrange(key, -max_messages, -1)
        return [json.loads(h) for h in history]

    def add_to_history(
        self,
        session_id: str,
        user_message: str,
        assistant_response: str
    ):
        """Append a user/assistant exchange to the history"""
        key = f"llm:history:{session_id}"

        self.redis.rpush(key, json.dumps({
            "role": "user",
            "content": user_message
        }))

        self.redis.rpush(key, json.dumps({
            "role": "assistant",
            "content": assistant_response
        }))

        # Keep only the last 20 messages (10 exchanges)
        self.redis.ltrim(key, -20, -1)
        self.redis.expire(key, self.history_ttl)

    def get_market_context(self, symbol: str) -> Optional[Dict]:
        """Return the cached market context, if any"""
        key = f"llm:market_context:{symbol}"
        data = self.redis.get(key)
        return json.loads(data) if data else None

    def set_market_context(self, symbol: str, context: Dict):
        """Cache the market context"""
        key = f"llm:market_context:{symbol}"
        self.redis.setex(key, self.cache_ttl, json.dumps(context))

    def get_user_preferences(self, user_id: str) -> Dict:
        """Return the user's preferences (create the client with
        decode_responses=True so keys/values come back as str)"""
        key = f"llm:user_prefs:{user_id}"
        return self.redis.hgetall(key) or {}

    def set_user_preference(self, user_id: str, pref: str, value: str):
        """Set a single user preference"""
        key = f"llm:user_prefs:{user_id}"
        self.redis.hset(key, pref, value)
```
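The `ltrim(key, -20, -1)` call above retains only the 20 most recent entries (10 user/assistant exchanges). The same windowing on a plain Python list, for intuition:

```python
history = [f"msg-{i}" for i in range(30)]  # 30 messages pushed over time
window = history[-20:]                     # what LTRIM key -20 -1 retains
print(len(window), window[0], window[-1])
```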

## Implementation

### Docker Compose

IMPORTANT: Ports must follow the policy defined in /core/devtools/environment/DEVENV-PORTS.md.

Ports assigned to trading-platform:

- Base range: 3600
- Frontend: 5179
- Backend API: 3600
- Database: 5438 (or shared 5432)
- Redis: 6385
- MinIO: 9600/9601
```yaml
# docker-compose.llm.yaml

version: '3.8'

services:
  ollama:
    image: ollama/ollama:latest
    container_name: trading-ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped

  llm-service:
    build:
      context: .
      dockerfile: Dockerfile.llm
    container_name: trading-llm
    ports:
      - "3602:3602"  # LLM service (base 3600 + 2)
    environment:
      - OLLAMA_URL=http://ollama:11434
      - REDIS_URL=redis://redis:6379  # container port; 6385 is only the host mapping
      - ML_ENGINE_URL=http://ml-engine:3601
      - TRADING_URL=http://trading-service:3603
    depends_on:
      - ollama
      - redis
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    container_name: trading-redis
    ports:
      - "6385:6379"  # Host port assigned per DEVENV-PORTS
    volumes:
      - redis_data:/data
    restart: unless-stopped

volumes:
  ollama_data:
  redis_data:
```

### Initialization Script

```bash
#!/bin/bash
# scripts/init_llm.sh

echo "=== Trading Platform LLM Setup ==="

# 1. Check GPU
echo "Checking GPU..."
nvidia-smi

# 2. Start Ollama (host-level; skip this if only the Dockerized
#    ollama service is used, since both bind port 11434)
echo "Starting Ollama..."
ollama serve &
sleep 5

# 3. Pull model
echo "Pulling Llama3 model..."
ollama pull llama3:8b-instruct-q5_K_M

# 4. Test model
echo "Testing model..."
ollama run llama3:8b-instruct-q5_K_M "Hello, respond with OK if working"

# 5. Start services
echo "Starting LLM service..."
docker-compose -f docker-compose.llm.yaml up -d

echo "=== Setup Complete ==="
echo "LLM Service: http://localhost:3602"
echo "Ollama API: http://localhost:11434"
```

## Testing and Validation

### Test Cases

```python
# tests/test_llm_service.py

import time

import httpx
import pytest

LLM_URL = "http://localhost:3602"  # Port assigned per DEVENV-PORTS (base 3600 + 2)

@pytest.mark.asyncio
async def test_health_check():
    """Test health endpoint"""
    async with httpx.AsyncClient() as client:
        response = await client.get(f"{LLM_URL}/api/health")
        assert response.status_code == 200
        assert response.json()["status"] == "healthy"

@pytest.mark.asyncio
async def test_simple_chat():
    """Test basic chat"""
    async with httpx.AsyncClient(timeout=60.0) as client:
        response = await client.post(
            f"{LLM_URL}/api/chat",
            json={
                "message": "Hello, how are you?",
                "session_id": "test_session"
            }
        )
        assert response.status_code == 200
        assert len(response.json()["response"]) > 0

@pytest.mark.asyncio
async def test_get_signal():
    """Test signal retrieval via LLM"""
    async with httpx.AsyncClient(timeout=60.0) as client:
        response = await client.post(
            f"{LLM_URL}/api/chat",
            json={
                "message": "Give me the current signal for XAUUSD",
                "session_id": "test_signal",
                "symbol": "XAUUSD"
            }
        )
        assert response.status_code == 200
        data = response.json()
        assert "get_ml_signal" in data["tools_used"]

@pytest.mark.asyncio
async def test_response_time():
    """Test that end-to-end response time stays within budget"""
    async with httpx.AsyncClient(timeout=60.0) as client:
        start = time.time()
        response = await client.post(
            f"{LLM_URL}/api/chat",
            json={
                "message": "Analyze the XAUUSD market",
                "session_id": "test_perf"
            }
        )
        elapsed = time.time() - start

        assert response.status_code == 200
        assert elapsed < 5.0  # 5s max (including tool calls)
```

### Validation Metrics

| Metric | Target | How to Measure |
|--------|--------|----------------|
| Response Time | <3s | pytest benchmark |
| Tool Accuracy | >95% | Manual review |
| Context Retention | 100% | History tests |
| GPU Memory | <14GB | nvidia-smi |
| Uptime | >99% | Monitoring |

Document generated: 2025-12-08 | Trading Strategist - Trading Platform