| id | title | type | project | version | updated_date |
|---|---|---|---|---|---|
| INTEGRACION-LLM-LOCAL | Local LLM Integration - chatgpt-oss 16GB | Documentation | trading-platform | 1.0.0 | 2026-01-04 |
Local LLM Integration - chatgpt-oss 16GB
Version: 1.0.0 Date: 2025-12-08 Module: OQI-007-llm-agent Author: Trading Strategist - Trading Platform
Table of Contents
- Overview
- Hardware Specifications
- Integration Architecture
- Model Configuration
- Trading Tools
- System Prompt
- API Endpoints
- Context Management
- Implementation
- Testing and Validation
Overview
Objective
Integrate a local LLM (chatgpt-oss or equivalent) that acts as a trading copilot, capable of:
- Analyzing ML signals and explaining them in natural language
- Making trading decisions based on context
- Executing operations via MetaTrader4
- Answering questions about markets and strategies
- Managing alerts and notifications
High-Level Architecture
┌─────────────────────────────────────────────────────────────────────────────┐
│ LLM TRADING COPILOT │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ User Interface │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ Chat Interface / CLI / Telegram Bot │ │
│ └───────────────────────────────────┬─────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ LLM Service (FastAPI) │ │
│ │ ┌─────────────────────────────────────────────────────────────────┐│ │
│ │ │ Request Handler ││ │
│ │ │ - Parse user input ││ │
│ │ │ - Load context from Redis ││ │
│ │ │ - Build prompt ││ │
│ │ └──────────────────────────────────┬──────────────────────────────┘│ │
│ │ │ │ │
│ │ ┌──────────────────────────────────▼──────────────────────────────┐│ │
│ │ │ LLM Engine (chatgpt-oss / Llama3) ││ │
│ │ │ ┌────────────────────────────────────────────────────────────┐ ││ │
│ │ │ │ Ollama / vLLM / llama.cpp │ ││ │
│ │ │ │ - GPU: NVIDIA RTX 5060 Ti 16GB │ ││ │
│ │ │ │ - Context: 8K tokens │ ││ │
│ │ │ │ - Response time: <3s │ ││ │
│ │ │ └────────────────────────────────────────────────────────────┘ ││ │
│ │ └──────────────────────────────────┬──────────────────────────────┘│ │
│ │ │ │ │
│ │ ┌──────────────────────────────────▼──────────────────────────────┐│ │
│ │ │ Tool Executor ││ │
│ │ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ││ │
│ │ │ │get_signal│ │ analyze │ │ execute │ │ portfolio│ ││ │
│ │ │ │ │ │ _market │ │ _trade │ │ _status │ ││ │
│ │ │ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ ││ │
│ │ │ │ │ │ │ ││ │
│ │ └───────┼────────────┼────────────┼────────────┼──────────────────┘│ │
│ └──────────┼────────────┼────────────┼────────────┼────────────────────┘ │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ External Services │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ ML Engine │ │ PostgreSQL │ │ MetaTrader4 │ │ Redis │ │ │
│ │ │ (signals) │ │ (data) │ │ (trading) │ │ (context) │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Hardware Specifications
Available GPU
| Specification | Value |
|---|---|
| Model | NVIDIA RTX 5060 Ti |
| VRAM | 16GB GDDR7 |
| CUDA Cores | TBD |
| Tensor Cores | Yes |
Compatible Models
| Model | VRAM Req | Context | Speed | Recommendation |
|---|---|---|---|---|
| Llama 3 8B | ~10GB | 8K | ~20 tok/s | Recommended |
| Mistral 7B | ~8GB | 8K | ~25 tok/s | Alternative |
| Qwen2 7B | ~9GB | 32K | ~22 tok/s | Alternative |
| Phi-3 Mini 3.8B | ~4GB | 4K | ~40 tok/s | Backup |
Optimal Configuration
# Llama 3 8B Instruct
model:
  name: llama3:8b-instruct-q5_K_M
  quantization: Q5_K_M  # quality/VRAM balance
  context_length: 8192
  batch_size: 512
gpu:
  device: cuda:0
  memory_fraction: 0.85  # 13.6GB of 16GB
  offload: false
inference:
  temperature: 0.7
  top_p: 0.9
  max_tokens: 2048
  repeat_penalty: 1.1
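The VRAM figures in the table above can be sanity-checked with simple arithmetic: a quantized model needs roughly params × bits-per-weight / 8 bytes for weights, plus a KV cache that grows with context length. A rough sketch — the 1.2 overhead factor and the KV-cache formula are approximations (GQA models such as Llama 3 use considerably less KV memory than this full-attention estimate):

```python
def estimate_vram_gb(params_billion: float, bits_per_weight: float,
                     ctx_tokens: int = 8192, n_layers: int = 32,
                     d_model: int = 4096, overhead: float = 1.2) -> float:
    """Rough VRAM estimate: quantized weights + fp16 KV cache + runtime overhead."""
    weights_gb = params_billion * 1e9 * bits_per_weight / 8 / 1e9
    # KV cache: 2 tensors (K and V) * 2 bytes (fp16) per layer per token
    kv_gb = 2 * 2 * n_layers * d_model * ctx_tokens / 1e9
    return (weights_gb + kv_gb) * overhead

# Llama 3 8B at Q5_K_M (~5.5 effective bits/weight) with an 8K context
est = estimate_vram_gb(8, 5.5)
```

This lands in the same ballpark as the ~10GB quoted for Llama 3 8B above, which is why a 16GB card with `memory_fraction: 0.85` has headroom.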
Integration Architecture
Components
┌──────────────────────────────────────────────────────────────────────┐
│ LLM SERVICE COMPONENTS │
├──────────────────────────────────────────────────────────────────────┤
│ │
│ 1. LLM Runtime (Ollama) │
│ ├── Model Server (GPU) │
│ ├── REST API (:11434) │
│ └── Model Management │
│ │
│ 2. Trading Service (FastAPI) │
│ ├── Chat Endpoint (/api/chat) │
│ ├── Tool Executor │
│ └── Response Formatter │
│ │
│ 3. Context Manager (Redis) │
│ ├── Conversation History │
│ ├── Market Context │
│ └── User Preferences │
│ │
│ 4. Trading Tools │
│ ├── get_ml_signal() │
│ ├── analyze_market() │
│ ├── execute_trade() │
│ ├── get_portfolio() │
│ └── set_alert() │
│ │
└──────────────────────────────────────────────────────────────────────┘
Request Flow
User Message
│
▼
┌─────────────────┐
│ FastAPI Handler │
└────────┬────────┘
│
▼
┌─────────────────┐ ┌─────────────────┐
│ Load Context │◄───▶│ Redis │
│ from Redis │ └─────────────────┘
└────────┬────────┘
│
▼
┌─────────────────┐
│ Build Prompt │
│ (system + hist │
│ + user msg) │
└────────┬────────┘
│
▼
┌─────────────────┐ ┌─────────────────┐
│ LLM Inference │◄───▶│ Ollama (GPU) │
│ │ └─────────────────┘
└────────┬────────┘
│
▼
┌─────────────────┐
│ Parse Response │
│ Check for Tools │
└────────┬────────┘
│
┌────┴────┐
│ Tools? │
└────┬────┘
Yes │ No
│ │
▼ ▼
┌───────┐ ┌───────────────┐
│Execute│ │Return Response│
│ Tools │ │ │
└───┬───┘ └───────────────┘
│
▼
┌─────────────────┐
│ Format Results │
│ Send to LLM │
└────────┬────────┘
│
▼
┌─────────────────┐
│ Final Response │
│ Save to Context │
└─────────────────┘
Model Configuration
Installing Ollama
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# Download the model
ollama pull llama3:8b-instruct-q5_K_M
# Verify GPU inference
ollama run --verbose llama3:8b-instruct-q5_K_M
# Check VRAM usage
nvidia-smi
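Before wiring the service together it is worth confirming, from code, that the Ollama API is reachable and the model is actually pulled. A minimal sketch against Ollama's `/api/tags` endpoint (the guarded HTTP call assumes a local Ollama on port 11434; `model_available` is a plain helper that can be tested offline):

```python
import json
import urllib.request

def model_available(tags_response: dict, model_name: str) -> bool:
    """Check whether a model name appears in an Ollama /api/tags response."""
    return any(m.get("name") == model_name
               for m in tags_response.get("models", []))

def check_ollama(base_url: str = "http://localhost:11434",
                 model: str = "llama3:8b-instruct-q5_K_M") -> bool:
    """Return True if Ollama is up and the model is pulled."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
            tags = json.load(resp)
    except OSError:
        return False  # Ollama not running / not reachable
    return model_available(tags, model)
```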
Configuration File
# config/llm_config.yaml
llm:
  provider: ollama
  model: llama3:8b-instruct-q5_K_M
  base_url: http://localhost:11434
inference:
  temperature: 0.7
  top_p: 0.9
  max_tokens: 2048
  stop_sequences:
    - "</tool>"
    - "Human:"
context:
  max_history: 10  # last 10 messages
  include_market_context: true
  include_portfolio: true
redis:
  host: localhost
  port: 6379
  db: 0
  prefix: "llm:"
  ttl: 3600  # 1 hour
tools:
  timeout: 30  # seconds per tool
  retry_count: 3
  parallel_execution: true
logging:
  level: INFO
  file: logs/llm_service.log
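Loading this file is a one-liner with PyYAML; a sketch with defaults filled in for missing inference keys (`load_llm_config` and its default values are illustrative, not part of the service code):

```python
import yaml  # PyYAML

DEFAULTS = {"inference": {"temperature": 0.7, "top_p": 0.9, "max_tokens": 2048}}

def load_llm_config(text: str) -> dict:
    """Parse the YAML config and fill in defaults for missing inference keys."""
    cfg = yaml.safe_load(text) or {}
    merged = dict(DEFAULTS["inference"])
    merged.update(cfg.get("inference", {}))
    cfg["inference"] = merged
    return cfg

# Keys present in the file win; absent keys fall back to DEFAULTS
cfg = load_llm_config("llm:\n  provider: ollama\ninference:\n  temperature: 0.2\n")
```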
Trading Tools
Tool Definitions
# tools/trading_tools.py
from typing import Dict, Any, Optional
import httpx

class TradingTools:
    """
    Trading tools available to the LLM
    """
    def __init__(self, config: Dict):
        self.ml_engine_url = config['ml_engine_url']
        self.trading_url = config['trading_url']
        self.data_url = config['data_url']

    async def get_ml_signal(
        self,
        symbol: str,
        timeframe: str = "5m"
    ) -> Dict[str, Any]:
        """
        Gets the current ML signal for a symbol
        Args:
            symbol: Trading pair (XAUUSD, EURUSD, etc.)
            timeframe: Timeframe (5m, 15m, 1h)
        Returns:
            {
                "action": "LONG" | "SHORT" | "HOLD",
                "confidence": 0.78,
                "entry_price": 2650.50,
                "stop_loss": 2645.20,
                "take_profit": 2661.10,
                "risk_reward": 2.0,
                "amd_phase": "accumulation",
                "killzone": "london_open",
                "reasoning": ["AMD: Accumulation (78%)", ...]
            }
        """
        async with httpx.AsyncClient() as client:
            response = await client.get(
                f"{self.ml_engine_url}/api/signal",
                params={"symbol": symbol, "timeframe": timeframe}
            )
            return response.json()
    async def analyze_market(
        self,
        symbol: str
    ) -> Dict[str, Any]:
        """
        Analyzes the current market state
        Returns:
            {
                "symbol": "XAUUSD",
                "current_price": 2650.50,
                "amd_phase": "accumulation",
                "ict_context": {
                    "killzone": "london_open",
                    "ote_zone": "discount",
                    "score": 0.72
                },
                "key_levels": {
                    "resistance": [2660.00, 2675.50],
                    "support": [2640.00, 2625.00]
                },
                "recent_signals": [...],
                "trend": "bullish",
                "volatility": "medium"
            }
        """
        async with httpx.AsyncClient() as client:
            response = await client.get(
                f"{self.ml_engine_url}/api/analysis",
                params={"symbol": symbol}
            )
            return response.json()
    async def execute_trade(
        self,
        symbol: str,
        action: str,
        size: float,
        stop_loss: float,
        take_profit: float,
        account_id: Optional[str] = None
    ) -> Dict[str, Any]:
        """
        Executes a trading operation
        Args:
            symbol: Trading pair
            action: "BUY" or "SELL"
            size: Position size (lots)
            stop_loss: Stop loss price
            take_profit: Take profit price
            account_id: MT4 account ID (optional)
        Returns:
            {
                "success": true,
                "ticket": 123456,
                "executed_price": 2650.45,
                "slippage": 0.05,
                "message": "Order executed successfully"
            }
        """
        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"{self.trading_url}/api/trade",
                json={
                    "symbol": symbol,
                    "action": action,
                    "size": size,
                    "stop_loss": stop_loss,
                    "take_profit": take_profit,
                    "account_id": account_id
                }
            )
            return response.json()
    async def get_portfolio(
        self,
        account_id: Optional[str] = None
    ) -> Dict[str, Any]:
        """
        Gets the portfolio state
        Returns:
            {
                "balance": 10000.00,
                "equity": 10150.00,
                "margin": 500.00,
                "free_margin": 9650.00,
                "positions": [
                    {
                        "ticket": 123456,
                        "symbol": "XAUUSD",
                        "type": "BUY",
                        "size": 0.1,
                        "open_price": 2640.00,
                        "current_price": 2650.50,
                        "profit": 105.00,
                        "stop_loss": 2630.00,
                        "take_profit": 2660.00
                    }
                ],
                "daily_pnl": 250.00,
                "weekly_pnl": 450.00
            }
        """
        # Omit the param entirely when no account is given
        params = {"account_id": account_id} if account_id else {}
        async with httpx.AsyncClient() as client:
            response = await client.get(
                f"{self.trading_url}/api/portfolio",
                params=params
            )
            return response.json()
    async def set_alert(
        self,
        symbol: str,
        condition: str,
        price: float,
        message: str
    ) -> Dict[str, Any]:
        """
        Creates a price alert
        Args:
            symbol: Trading pair
            condition: "above" or "below"
            price: Target price
            message: Alert message
        Returns:
            {
                "alert_id": "alert_123",
                "status": "active",
                "message": "Alert created successfully"
            }
        """
        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"{self.trading_url}/api/alerts",
                json={
                    "symbol": symbol,
                    "condition": condition,
                    "price": price,
                    "message": message
                }
            )
            return response.json()
    async def get_market_data(
        self,
        symbol: str,
        timeframe: str = "5m",
        bars: int = 100
    ) -> Dict[str, Any]:
        """
        Gets historical market data
        Returns:
            {
                "symbol": "XAUUSD",
                "timeframe": "5m",
                "data": [
                    {"time": "2024-12-08 10:00", "open": 2648, "high": 2652, ...},
                    ...
                ]
            }
        """
        async with httpx.AsyncClient() as client:
            response = await client.get(
                f"{self.data_url}/api/market_data",
                params={
                    "symbol": symbol,
                    "timeframe": timeframe,
                    "bars": bars
                }
            )
            return response.json()
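The tools above return prices but leave position sizing to the caller. A common approach, consistent with the risk rules this document gives the model, is fixed-fractional sizing: risk a set percentage of the balance per trade and derive lots from the stop distance. A sketch — the $10-per-pip-per-lot figure is an assumption that varies by symbol and broker, and `size_position` is illustrative, not an existing tool:

```python
import math

def size_position(balance: float, risk_pct: float, stop_pips: float,
                  pip_value_per_lot: float = 10.0, lot_step: float = 0.01) -> float:
    """Lots to trade so that hitting the stop loses ~risk_pct of the balance."""
    if stop_pips <= 0:
        raise ValueError("stop distance must be positive")
    risk_amount = balance * risk_pct / 100.0
    lots = risk_amount / (stop_pips * pip_value_per_lot)
    # round down to the broker's lot step so risk never exceeds the target
    return max(lot_step, math.floor(lots / lot_step + 1e-9) * lot_step)

# Example: $10,000 balance, 1% risk, 25-pip stop
lots = size_position(10_000, 1.0, 25)
```

Rounding down (not to nearest) is deliberate: it keeps realized risk at or below the configured percentage.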
Tool Schema for the LLM
# tools/tool_schema.py
TOOL_DEFINITIONS = [
    {
        "name": "get_ml_signal",
        "description": "Gets the current trading signal generated by the ML models. Use this to see the system's recommendation.",
        "parameters": {
            "type": "object",
            "properties": {
                "symbol": {
                    "type": "string",
                    "description": "Trading pair (XAUUSD, EURUSD, GBPUSD, USDJPY)",
                    "enum": ["XAUUSD", "EURUSD", "GBPUSD", "USDJPY"]
                },
                "timeframe": {
                    "type": "string",
                    "description": "Analysis timeframe",
                    "enum": ["5m", "15m", "1h"],
                    "default": "5m"
                }
            },
            "required": ["symbol"]
        }
    },
    {
        "name": "analyze_market",
        "description": "Analyzes the current market state, including AMD phase, ICT context, key levels and trend.",
        "parameters": {
            "type": "object",
            "properties": {
                "symbol": {
                    "type": "string",
                    "description": "Trading pair",
                    "enum": ["XAUUSD", "EURUSD", "GBPUSD", "USDJPY"]
                }
            },
            "required": ["symbol"]
        }
    },
    {
        "name": "execute_trade",
        "description": "Executes a trade. IMPORTANT: Always confirm with the user before executing.",
        "parameters": {
            "type": "object",
            "properties": {
                "symbol": {
                    "type": "string",
                    "description": "Trading pair"
                },
                "action": {
                    "type": "string",
                    "description": "Order type",
                    "enum": ["BUY", "SELL"]
                },
                "size": {
                    "type": "number",
                    "description": "Size in lots (0.01 - 10.0)"
                },
                "stop_loss": {
                    "type": "number",
                    "description": "Stop loss price"
                },
                "take_profit": {
                    "type": "number",
                    "description": "Take profit price"
                }
            },
            "required": ["symbol", "action", "size", "stop_loss", "take_profit"]
        }
    },
    {
        "name": "get_portfolio",
        "description": "Gets the current portfolio state, including balance, open positions and P&L.",
        "parameters": {
            "type": "object",
            "properties": {},
            "required": []
        }
    },
    {
        "name": "set_alert",
        "description": "Creates a price alert for notification.",
        "parameters": {
            "type": "object",
            "properties": {
                "symbol": {
                    "type": "string",
                    "description": "Trading pair"
                },
                "condition": {
                    "type": "string",
                    "enum": ["above", "below"]
                },
                "price": {
                    "type": "number",
                    "description": "Target price"
                },
                "message": {
                    "type": "string",
                    "description": "Alert message"
                }
            },
            "required": ["symbol", "condition", "price", "message"]
        }
    }
]
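Local models frequently emit tool calls with missing or out-of-range arguments, so it is worth validating a call against these schemas before executing it. A minimal validator sketch — it checks only `required`, `enum`, and numeric types, and `validate_tool_call` is illustrative, not part of the service code:

```python
def validate_tool_call(definitions: list, name: str, args: dict) -> list:
    """Return a list of problems; an empty list means the call looks valid."""
    schema = next((d for d in definitions if d["name"] == name), None)
    if schema is None:
        return [f"unknown tool: {name}"]
    props = schema["parameters"]["properties"]
    problems = [f"missing required arg: {r}"
                for r in schema["parameters"].get("required", []) if r not in args]
    for key, value in args.items():
        spec = props.get(key)
        if spec is None:
            problems.append(f"unexpected arg: {key}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{key} must be one of {spec['enum']}")
        elif spec.get("type") == "number" and not isinstance(value, (int, float)):
            problems.append(f"{key} must be a number")
    return problems
```

Calling it with, say, an unsupported symbol returns a human-readable problem list that can be fed straight back to the model for a retry.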
System Prompt
Main Prompt
# prompts/system_prompt.py
SYSTEM_PROMPT = """You are Trading Platform AI, a trading copilot specialized in forex and precious-metals markets.
## Your Role
- Analyze signals from the ML models and explain them clearly
- Help make informed trading decisions
- Execute trades when the user authorizes it
- Manage risk and protect capital
## Knowledge
You have access to ML models that analyze:
- **AMD (Accumulation-Manipulation-Distribution)**: Market phases
  - Accumulation: Institutions accumulating, better for longs
  - Manipulation: Stop hunts, avoid entries
  - Distribution: Institutions selling, better for shorts
- **ICT Concepts**: Killzones, OTE, Premium/Discount
  - London Open (02:00-05:00 EST): High probability
  - NY AM (08:30-11:00 EST): Maximum liquidity
  - Discount Zone (0-50%): Better for buys
  - Premium Zone (50-100%): Better for sells
- **SMC**: BOS, CHOCH, Inducement, Displacement
## Available Tools
1. `get_ml_signal(symbol, timeframe)` - Gets the current ML signal
2. `analyze_market(symbol)` - Full market analysis
3. `execute_trade(symbol, action, size, sl, tp)` - Executes a trade
4. `get_portfolio()` - Portfolio state
5. `set_alert(symbol, condition, price, message)` - Creates an alert
## Important Rules
1. **ALWAYS** explain the reasoning behind every recommendation
2. **NEVER** execute trades without explicit user confirmation
3. **ALWAYS** mention the risk (R:R, % of account)
4. **PRIORITIZE** capital preservation
5. If model confidence is <60%, recommend WAITING
6. During a Manipulation phase, recommend NOT TRADING
## Response Format for Signals
When presenting a signal, use this format:
**Signal: [LONG/SHORT/HOLD]**
- Symbol: [SYMBOL]
- Confidence: [X]%
- AMD Phase: [phase]
- Killzone: [killzone]
**Levels:**
- Entry: [price]
- Stop Loss: [price] ([X] pips)
- Take Profit: [price] ([X] pips)
- R:R: [X]:1
**Reasoning:**
1. [reason 1]
2. [reason 2]
3. [reason 3]
**Risk:** [X]% of the account
---
## Current Context
{market_context}
## Portfolio
{portfolio_context}
"""
def build_system_prompt(market_context: dict = None, portfolio_context: dict = None) -> str:
    """Builds the system prompt with current context"""
    market_str = ""
    if market_context:
        market_str = f"""
Current price: {market_context.get('current_price', 'N/A')}
AMD phase: {market_context.get('amd_phase', 'N/A')}
Killzone: {market_context.get('killzone', 'N/A')}
Trend: {market_context.get('trend', 'N/A')}
"""
    portfolio_str = ""
    if portfolio_context:
        portfolio_str = f"""
Balance: ${portfolio_context.get('balance', 0):,.2f}
Equity: ${portfolio_context.get('equity', 0):,.2f}
Open positions: {len(portfolio_context.get('positions', []))}
Daily P&L: ${portfolio_context.get('daily_pnl', 0):,.2f}
"""
    return SYSTEM_PROMPT.format(
        market_context=market_str or "Not available",
        portfolio_context=portfolio_str or "Not available"
    )
API Endpoints
FastAPI Service
# services/llm_service.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import List, Optional
import redis
import json

app = FastAPI(title="Trading Platform LLM Service")

class Message(BaseModel):
    role: str  # "user" or "assistant"
    content: str

class ChatRequest(BaseModel):
    message: str
    session_id: str
    symbol: Optional[str] = "XAUUSD"

class ChatResponse(BaseModel):
    response: str
    tools_used: List[str]
    signal: Optional[dict] = None

# Redis connection (decode_responses=True so values come back as str, not bytes)
redis_client = redis.Redis(host='localhost', port=6379, db=0, decode_responses=True)
@app.post("/api/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    """
    Main chat endpoint
    (helpers such as load_conversation_history, get_market_context,
    save_conversation and extract_signal_from_response are defined elsewhere)
    """
    try:
        # 1. Load conversation history
        history = load_conversation_history(request.session_id)
        # 2. Get market context
        market_context = await get_market_context(request.symbol)
        # 3. Get portfolio context
        portfolio_context = await get_portfolio_context()
        # 4. Build prompt
        system_prompt = build_system_prompt(market_context, portfolio_context)
        # 5. Call LLM
        response, tools_used = await call_llm(
            system_prompt=system_prompt,
            history=history,
            user_message=request.message
        )
        # 6. Save to history
        save_conversation(request.session_id, request.message, response)
        # 7. Extract signal if present
        signal = extract_signal_from_response(response)
        return ChatResponse(
            response=response,
            tools_used=tools_used,
            signal=signal
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@app.get("/api/health")
async def health():
    """Health check"""
    return {"status": "healthy", "llm_status": await check_llm_status()}

@app.post("/api/clear_history")
async def clear_history(session_id: str):
    """Clears the conversation history"""
    redis_client.delete(f"llm:history:{session_id}")
    return {"status": "cleared"}
async def call_llm(
    system_prompt: str,
    history: List[Message],
    user_message: str
) -> tuple[str, List[str]]:
    """
    Calls the LLM via Ollama
    """
    import httpx
    # Build messages
    messages = [{"role": "system", "content": system_prompt}]
    for msg in history:
        messages.append({"role": msg.role, "content": msg.content})
    messages.append({"role": "user", "content": user_message})
    # Call Ollama
    async with httpx.AsyncClient(timeout=60.0) as client:
        response = await client.post(
            "http://localhost:11434/api/chat",
            json={
                "model": "llama3:8b-instruct-q5_K_M",
                "messages": messages,
                "stream": False,
                "options": {
                    "temperature": 0.7,
                    "top_p": 0.9,
                    "num_predict": 2048
                }
            }
        )
        result = response.json()
    llm_response = result["message"]["content"]
    # Check for tool calls
    tools_used = []
    if "<tool>" in llm_response:
        llm_response, tools_used = await process_tool_calls(llm_response)
    return llm_response, tools_used
async def process_tool_calls(response: str) -> tuple[str, List[str]]:
    """
    Processes tool calls found in the response
    """
    import re
    tools_used = []
    tool_pattern = r"<tool>(\w+)\((.*?)\)</tool>"
    matches = re.findall(tool_pattern, response)
    for tool_name, args_str in matches:
        tools_used.append(tool_name)
        # Parse arguments (expects '"key": value' pairs inside the parentheses)
        args = json.loads(f"{{{args_str}}}")
        # Execute tool
        tool_result = await execute_tool(tool_name, args)
        # Replace the tool call with its result in the response
        response = response.replace(
            f"<tool>{tool_name}({args_str})</tool>",
            f"\n**{tool_name} result:**\n```json\n{json.dumps(tool_result, indent=2)}\n```\n"
        )
    return response, tools_used
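The chat endpoint above calls `extract_signal_from_response`, which is not defined anywhere in this document. A minimal sketch that pulls the action and confidence out of a formatted signal response — the regexes are illustrative and must match whatever response format the system prompt actually prescribes; fields beyond action and confidence are omitted here:

```python
import re
from typing import Optional

def extract_signal_from_response(response: str) -> Optional[dict]:
    """Parse a '**Signal: LONG**' style response into a small dict, or None."""
    action = re.search(r"\*\*Signal:\s*(LONG|SHORT|HOLD)", response)
    if not action:
        return None
    signal = {"action": action.group(1)}
    confidence = re.search(r"Confidence:\s*([\d.]+)\s*%", response)
    if confidence:
        signal["confidence"] = float(confidence.group(1)) / 100.0
    return signal
```

Returning `None` for non-signal chatter keeps the `signal` field of `ChatResponse` optional, matching its Pydantic declaration.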
Context Management
Redis Schema
# context/redis_schema.py
"""
Redis Keys Structure:
llm:history:{session_id} - Conversation history (LIST)
llm:market_context:{symbol} - Market context cache (STRING, TTL=60s)
llm:portfolio:{user_id} - Portfolio cache (STRING, TTL=30s)
llm:user_prefs:{user_id} - User preferences (HASH)
llm:alerts:{user_id} - Active alerts (SET)
"""
import redis
from typing import List, Dict, Optional
import json

class ContextManager:
    def __init__(self, redis_client: redis.Redis):
        self.redis = redis_client
        self.history_ttl = 3600  # 1 hour
        self.cache_ttl = 60  # 60 seconds

    def get_conversation_history(
        self,
        session_id: str,
        max_messages: int = 10
    ) -> List[Dict]:
        """Gets the conversation history"""
        key = f"llm:history:{session_id}"
        history = self.redis.lrange(key, -max_messages, -1)
        return [json.loads(h) for h in history]

    def add_to_history(
        self,
        session_id: str,
        user_message: str,
        assistant_response: str
    ):
        """Appends a message pair to the history"""
        key = f"llm:history:{session_id}"
        self.redis.rpush(key, json.dumps({
            "role": "user",
            "content": user_message
        }))
        self.redis.rpush(key, json.dumps({
            "role": "assistant",
            "content": assistant_response
        }))
        # Keep only the last N messages
        self.redis.ltrim(key, -20, -1)
        self.redis.expire(key, self.history_ttl)

    def get_market_context(self, symbol: str) -> Optional[Dict]:
        """Gets the cached market context"""
        key = f"llm:market_context:{symbol}"
        data = self.redis.get(key)
        return json.loads(data) if data else None

    def set_market_context(self, symbol: str, context: Dict):
        """Caches the market context"""
        key = f"llm:market_context:{symbol}"
        self.redis.setex(key, self.cache_ttl, json.dumps(context))

    def get_user_preferences(self, user_id: str) -> Dict:
        """Gets the user's preferences"""
        key = f"llm:user_prefs:{user_id}"
        return self.redis.hgetall(key) or {}

    def set_user_preference(self, user_id: str, pref: str, value: str):
        """Sets a user preference"""
        key = f"llm:user_prefs:{user_id}"
        self.redis.hset(key, pref, value)
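The `ltrim(key, -20, -1)` call above is what turns the Redis list into a rolling window. The same semantics in plain Python are useful for unit-testing the trimming logic without a Redis instance (`RollingHistory` is an illustrative stand-in, not part of the service):

```python
import json

class RollingHistory:
    """In-memory stand-in for the Redis list with RPUSH + LTRIM(-max, -1)."""
    def __init__(self, max_messages: int = 20):
        self.max_messages = max_messages
        self._items: list[str] = []

    def add_pair(self, user_message: str, assistant_response: str) -> None:
        self._items.append(json.dumps({"role": "user", "content": user_message}))
        self._items.append(json.dumps({"role": "assistant", "content": assistant_response}))
        # LTRIM key -max -1 keeps only the last max_messages entries
        self._items = self._items[-self.max_messages:]

    def last(self, n: int) -> list[dict]:
        # LRANGE key -n -1
        return [json.loads(i) for i in self._items[-n:]]
```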
Implementation
Docker Compose
IMPORTANT: Ports must follow the policy defined in /core/devtools/environment/DEVENV-PORTS.md
Ports assigned to trading-platform:
- Base range: 3600
- Frontend: 5179
- Backend API: 3600
- Database: 5438 (or shared 5432)
- Redis: 6385
- MinIO: 9600/9601
# docker-compose.llm.yaml
version: '3.8'
services:
  ollama:
    image: ollama/ollama:latest
    container_name: trading-ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped
  llm-service:
    build:
      context: .
      dockerfile: Dockerfile.llm
    container_name: trading-llm
    ports:
      - "3602:3602"  # LLM service (base 3600 + 2)
    environment:
      - OLLAMA_URL=http://ollama:11434
      - REDIS_URL=redis://redis:6379  # container port; 6385 is only the host mapping
      - ML_ENGINE_URL=http://ml-engine:3601
      - TRADING_URL=http://trading-service:3603
    depends_on:
      - ollama
      - redis
    restart: unless-stopped
  redis:
    image: redis:7-alpine
    container_name: trading-redis
    ports:
      - "6385:6379"  # Host port assigned per DEVENV-PORTS
    volumes:
      - redis_data:/data
    restart: unless-stopped
volumes:
  ollama_data:
  redis_data:
Initialization Script
#!/bin/bash
# scripts/init_llm.sh
echo "=== Trading Platform LLM Setup ==="
# 1. Check GPU
echo "Checking GPU..."
nvidia-smi
# 2. Start Ollama on the host
# NOTE: skip this step if Ollama runs via docker-compose.llm.yaml;
# the host process and the container cannot both bind port 11434
echo "Starting Ollama..."
ollama serve &
sleep 5
# 3. Pull model
echo "Pulling Llama3 model..."
ollama pull llama3:8b-instruct-q5_K_M
# 4. Test model
echo "Testing model..."
ollama run llama3:8b-instruct-q5_K_M "Hello, respond with OK if working"
# 5. Start services
echo "Starting LLM service..."
docker-compose -f docker-compose.llm.yaml up -d
echo "=== Setup Complete ==="
echo "LLM Service: http://localhost:3602"
echo "Ollama API: http://localhost:11434"
Testing and Validation
Test Cases
# tests/test_llm_service.py
import pytest
import httpx

LLM_URL = "http://localhost:3602"  # Port assigned per DEVENV-PORTS (base 3600 + 2)

@pytest.mark.asyncio
async def test_health_check():
    """Test health endpoint"""
    async with httpx.AsyncClient() as client:
        response = await client.get(f"{LLM_URL}/api/health")
        assert response.status_code == 200
        assert response.json()["status"] == "healthy"

@pytest.mark.asyncio
async def test_simple_chat():
    """Test basic chat"""
    async with httpx.AsyncClient(timeout=60.0) as client:
        response = await client.post(
            f"{LLM_URL}/api/chat",
            json={
                "message": "Hello, how are you?",
                "session_id": "test_session"
            }
        )
        assert response.status_code == 200
        assert len(response.json()["response"]) > 0

@pytest.mark.asyncio
async def test_get_signal():
    """Test signal retrieval via the LLM"""
    async with httpx.AsyncClient(timeout=60.0) as client:
        response = await client.post(
            f"{LLM_URL}/api/chat",
            json={
                "message": "Give me the current signal for XAUUSD",
                "session_id": "test_signal",
                "symbol": "XAUUSD"
            }
        )
        assert response.status_code == 200
        data = response.json()
        assert "get_ml_signal" in data["tools_used"]

@pytest.mark.asyncio
async def test_response_time():
    """Response time should stay within the 5s budget (tool calls included)"""
    import time
    async with httpx.AsyncClient(timeout=60.0) as client:
        start = time.time()
        response = await client.post(
            f"{LLM_URL}/api/chat",
            json={
                "message": "Analyze the XAUUSD market",
                "session_id": "test_perf"
            }
        )
        elapsed = time.time() - start
        assert response.status_code == 200
        assert elapsed < 5.0  # 5s max (including tool calls)
Validation Metrics
| Metric | Target | How to Measure |
|---|---|---|
| Response Time | <3s | pytest benchmark |
| Tool Accuracy | >95% | Manual review |
| Context Retention | 100% | Test history |
| GPU Memory | <14GB | nvidia-smi |
| Uptime | >99% | Monitoring |
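A single timed request, as in test_response_time above, is noisy; latency targets like the <3s in this table are usually judged on percentiles over many runs. A small harness sketch, generic over any callable (the run count and percentile choices are illustrative):

```python
import time

def latency_percentiles(fn, runs: int = 20, percentiles=(0.5, 0.95)) -> dict:
    """Time `fn` over several runs and report the requested percentiles (seconds)."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    samples.sort()
    # nearest-rank percentile on the sorted samples
    return {p: samples[min(len(samples) - 1, int(p * len(samples)))]
            for p in percentiles}
```

Wrapping the `/api/chat` call in a small synchronous function and passing it here gives p50/p95 numbers that are far more meaningful than one stopwatch reading.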
Document generated: 2025-12-08 Trading Strategist - Trading Platform