| id | title | type | status | priority | epic | project | version | created_date | updated_date |
|---|---|---|---|---|---|---|---|---|---|
| ET-ML-001 | ML Engine Architecture | Technical Specification | Done | High | OQI-006 | trading-platform | 1.0.0 | 2025-12-05 | 2026-01-04 |
ET-ML-001: ML Engine Architecture
Metadata
| Field | Value |
|---|---|
| ID | ET-ML-001 |
| Epic | OQI-006 - ML Signals |
| Type | Technical Specification |
| Version | 1.0.0 |
| Status | Approved |
| Last updated | 2025-12-05 |
Purpose
Define the complete architecture of the ML Engine, including the service structure, the communication between components, and the data flows for predictions and trading signals.
General Architecture
High-Level Diagram
┌─────────────────────────────────────────────────────────────────────────────┐
│                                  ML ENGINE                                  │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                         FastAPI Application                         │    │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────────┐   │    │
│  │  │   Routers    │  │   Services   │  │     Background Tasks     │   │    │
│  │  │              │  │              │  │                          │   │    │
│  │  │ /predictions │  │ Predictor    │  │ ┌─────────────────────┐  │   │    │
│  │  │ /signals     │  │ SignalGen    │  │ │ Market Data Fetcher │  │   │    │
│  │  │ /indicators  │  │ Indicators   │  │ └─────────────────────┘  │   │    │
│  │  │ /models      │  │ ModelManager │  │ ┌─────────────────────┐  │   │    │
│  │  │ /health      │  │              │  │ │  Training Pipeline  │  │   │    │
│  │  └──────────────┘  └──────────────┘  │ └─────────────────────┘  │   │    │
│  │                                      │ ┌─────────────────────┐  │   │    │
│  │  ┌────────────────────────────────┐  │ │  Signal Publisher   │  │   │    │
│  │  │          Models Layer          │  │ └─────────────────────┘  │   │    │
│  │  │ ┌─────────────┐ ┌───────────┐  │  └──────────────────────────┘   │    │
│  │  │ │   XGBoost   │ │ Ensemble  │  │                                 │    │
│  │  │ │   Models    │ │  Manager  │  │                                 │    │
│  │  │ └─────────────┘ └───────────┘  │                                 │    │
│  │  └────────────────────────────────┘                                 │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                             Data Layer                              │    │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────────┐   │    │
│  │  │    Redis     │  │  PostgreSQL  │  │      Model Storage       │   │    │
│  │  │    Cache     │  │  (Metrics)   │  │      (File System)       │   │    │
│  │  └──────────────┘  └──────────────┘  └──────────────────────────┘   │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────────────────┘
              │                        │                        │
              ▼                        ▼                        ▼
        ┌───────────┐            ┌───────────┐            ┌───────────┐
        │  Binance  │            │  Backend  │            │ Frontend  │
        │    API    │            │  Express  │            │   React   │
        └───────────┘            └───────────┘            └───────────┘
Directory Structure
apps/ml-engine/
├── app/
│   ├── __init__.py
│   ├── main.py  # FastAPI entry point
│   ├── config/
│   │   ├── __init__.py
│   │   ├── settings.py  # Pydantic settings
│   │   └── logging.py  # Logging configuration
│   ├── api/
│   │   ├── __init__.py
│   │   ├── dependencies.py  # Dependency injection
│   │   └── routers/
│   │       ├── __init__.py
│   │       ├── predictions.py  # /predictions endpoints
│   │       ├── signals.py  # /signals endpoints
│   │       ├── indicators.py  # /indicators endpoints
│   │       ├── models.py  # /models endpoints
│   │       └── health.py  # /health endpoints
│   ├── core/
│   │   ├── __init__.py
│   │   ├── exceptions.py  # Custom exceptions
│   │   ├── security.py  # API key validation
│   │   └── middleware.py  # Custom middleware
│   ├── services/
│   │   ├── __init__.py
│   │   ├── predictor.py  # Prediction service
│   │   ├── signal_generator.py  # Signal generation
│   │   ├── indicator_calculator.py  # Technical indicators
│   │   ├── model_manager.py  # Model lifecycle
│   │   └── market_data.py  # Data fetching
│   ├── models/
│   │   ├── __init__.py
│   │   ├── base.py  # Base model class
│   │   ├── range_predictor.py  # Price range predictor
│   │   ├── tpsl_classifier.py  # TP/SL classifier
│   │   └── signal_classifier.py  # Signal classifier
│   ├── features/
│   │   ├── __init__.py
│   │   ├── builder.py  # Feature builder
│   │   ├── volatility.py  # Volatility features
│   │   ├── momentum.py  # Momentum features
│   │   ├── trend.py  # Trend features
│   │   └── volume.py  # Volume features
│   ├── schemas/
│   │   ├── __init__.py
│   │   ├── prediction.py  # Prediction DTOs
│   │   ├── signal.py  # Signal DTOs
│   │   └── indicator.py  # Indicator DTOs
│   └── tasks/
│       ├── __init__.py
│       ├── market_data_fetcher.py  # Background data fetch
│       ├── training_pipeline.py  # Model training
│       └── signal_publisher.py  # Signal broadcast
├── data/
│   ├── models/  # Trained model files
│   │   ├── range_predictor/
│   │   ├── tpsl_classifier/
│   │   └── signal_classifier/
│   └── cache/  # Cached market data
├── tests/
│   ├── __init__.py
│   ├── conftest.py
│   ├── unit/
│   └── integration/
├── scripts/
│   ├── train_models.py
│   └── evaluate_models.py
├── requirements.txt
├── Dockerfile
└── docker-compose.yml
Main Components
1. FastAPI Application
# app/main.py
from contextlib import asynccontextmanager

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

from app.config.settings import settings
from app.api.routers import predictions, signals, indicators, models, health
from app.tasks.market_data_fetcher import MarketDataFetcher
from app.services.model_manager import ModelManager


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup: load the trained models and start the background data fetcher
    model_manager = ModelManager()
    await model_manager.load_all_models()
    # Keep the loaded manager on app.state so request handlers can reach it
    app.state.model_manager = model_manager
    market_fetcher = MarketDataFetcher()
    await market_fetcher.start()
    yield
    # Shutdown: stop the background data fetcher
    await market_fetcher.stop()


app = FastAPI(
    title="Trading Platform ML Engine",
    version="1.0.0",
    lifespan=lifespan,
)

app.add_middleware(
    CORSMiddleware,
    allow_origins=settings.CORS_ORIGINS,
    allow_methods=["*"],
    allow_headers=["*"],
)

app.include_router(health.router, tags=["Health"])
app.include_router(predictions.router, prefix="/predictions", tags=["Predictions"])
app.include_router(signals.router, prefix="/signals", tags=["Signals"])
app.include_router(indicators.router, prefix="/indicators", tags=["Indicators"])
app.include_router(models.router, prefix="/models", tags=["Models"])
2. Configuration
# app/config/settings.py
from typing import List

from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    # API
    API_HOST: str = "0.0.0.0"
    API_PORT: int = 8000
    API_KEY: str

    # Database
    DATABASE_URL: str
    REDIS_URL: str

    # Binance
    BINANCE_API_KEY: str
    BINANCE_API_SECRET: str

    # Models
    MODEL_PATH: str = "./data/models"
    SUPPORTED_SYMBOLS: List[str] = ["BTCUSDT", "ETHUSDT"]

    # Features
    DEFAULT_HORIZONS: List[int] = [6, 18, 36, 72]  # candles

    # CORS
    CORS_ORIGINS: List[str] = ["http://localhost:3000"]

    class Config:
        env_file = ".env"


settings = Settings()
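The horizons in `DEFAULT_HORIZONS` are expressed in candles rather than in time. As a rough illustration, and assuming the 5-minute candle mentioned in the Redis cache notes, the sketch below converts them to minutes. The names `horizon_minutes` and `CANDLE_MINUTES` are hypothetical and are not part of the documented settings.

```python
# Assumption: each candle spans 5 minutes (taken from the cache TTL note).
CANDLE_MINUTES = 5

DEFAULT_HORIZONS = [6, 18, 36, 72]  # candles, as in Settings


def horizon_minutes(horizons, candle_minutes=CANDLE_MINUTES):
    """Map each horizon (in candles) to its duration in minutes."""
    return {h: h * candle_minutes for h in horizons}


print(horizon_minutes(DEFAULT_HORIZONS))
```

Under this assumption the four horizons cover 30, 90, 180 and 360 minutes.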
Data Flows
1. Prediction Flow
┌──────────┐    ┌──────────┐    ┌────────────┐    ┌──────────┐    ┌──────────┐
│  Client  │───▶│  Router  │───▶│  Service   │───▶│ Features │───▶│  Model   │
│          │    │          │    │            │    │ Builder  │    │          │
└──────────┘    └──────────┘    └────────────┘    └──────────┘    └──────────┘
                                      │                                │
                                      ▼                                │
                                ┌────────────┐                         │
                                │   Cache    │◀────────────────────────┘
                                │  (Redis)   │
                                └────────────┘
Sequence:
- The client requests a prediction via the API
- The router validates the request and extracts its parameters
- The service checks the cache (Redis)
- On a cache miss, the Feature Builder generates the features
- The model makes the prediction
- The result is cached and returned
2. Signal Flow
┌──────────────┐    ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│ Market Data  │───▶│   Feature    │───▶│    Signal    │───▶│  Publisher   │
│   Fetcher    │    │   Builder    │    │  Generator   │    │   (Redis)    │
└──────────────┘    └──────────────┘    └──────────────┘    └──────────────┘
       │                                       │                   │
       ▼                                       ▼                   ▼
┌──────────────┐                        ┌──────────────┐    ┌──────────────┐
│   Binance    │                        │  PostgreSQL  │    │   Backend    │
│     API      │                        │    (Logs)    │    │   Express    │
└──────────────┘                        └──────────────┘    └──────────────┘
Communication Between Services
Backend ↔ ML Engine
# HTTP communication
Protocol: REST over HTTPS
Authentication: API Key (X-API-Key header)
Format: JSON
Timeout: 30 seconds
# Main endpoints
POST /predictions   # Request a prediction
POST /signals       # Generate a signal
GET /indicators     # Get indicators
GET /models/status  # Model status
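To make the contract above concrete, here is a minimal sketch of how a caller could build a `POST /predictions` request with the `X-API-Key` header and the 30-second timeout. It is written in Python for consistency with the rest of this document, although the real backend is Express. The host name, the placeholder key and the function name `build_prediction_request` are assumptions. The payload fields `symbol` and `horizon` follow the `PredictionRequest` schema.

```python
import json
import urllib.request

# Hypothetical values for illustration; the real host and key come from configuration.
ML_ENGINE_URL = "https://ml-engine.example.com"
TIMEOUT_SECONDS = 30  # matches the documented timeout


def build_prediction_request(symbol: str, horizon: int, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST /predictions request carrying the API key."""
    body = json.dumps({"symbol": symbol, "horizon": horizon}).encode("utf-8")
    return urllib.request.Request(
        url=f"{ML_ENGINE_URL}/predictions",
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_prediction_request("BTCUSDT", 6, api_key="change-me")
# Sending it would look like:
#   urllib.request.urlopen(request, timeout=TIMEOUT_SECONDS)
```

Building the request separately from sending it keeps the header and payload logic easy to unit test without a running ML Engine.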
ML Engine ↔ Redis
# Prediction cache
Key pattern: "prediction:{symbol}:{horizon}:{timestamp}"
TTL: 60 seconds (each candle spans 5 minutes, i.e. 300 seconds)
# Signal publishing
Channel: "signals:{symbol}"
Message format: JSON serialized Signal object
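The key pattern, channel name and message format above could be produced by helpers like the following. This is only a sketch: the `Signal` fields are hypothetical (loosely mirroring the `ml.signal_logs` columns), and the helper names are not taken from the code base.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Signal:
    # Hypothetical fields, loosely mirroring the ml.signal_logs columns
    symbol: str
    signal_type: str
    confidence: float
    entry_price: float


def prediction_key(symbol: str, horizon: int, timestamp: int) -> str:
    """Build a cache key following "prediction:{symbol}:{horizon}:{timestamp}"."""
    return f"prediction:{symbol}:{horizon}:{timestamp}"


def signal_channel(symbol: str) -> str:
    """Build the pub/sub channel name following "signals:{symbol}"."""
    return f"signals:{symbol}"


def serialize_signal(signal: Signal) -> str:
    """Serialize a Signal to the JSON string published on the channel."""
    return json.dumps(asdict(signal))


message = serialize_signal(Signal("BTCUSDT", "BUY", 0.87, 43000.0))
```

A subscriber would then decode the message with `json.loads` and read the same field names.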
ML Engine ↔ PostgreSQL
-- Metrics and logs only
CREATE TABLE ml.prediction_logs (
    id UUID PRIMARY KEY,
    symbol VARCHAR(20),
    horizon INTEGER,
    predicted_high DECIMAL(20, 8),
    predicted_low DECIMAL(20, 8),
    actual_high DECIMAL(20, 8),
    actual_low DECIMAL(20, 8),
    mae DECIMAL(10, 6),
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE ml.signal_logs (
    id UUID PRIMARY KEY,
    symbol VARCHAR(20),
    signal_type VARCHAR(10),
    confidence DECIMAL(5, 4),
    entry_price DECIMAL(20, 8),
    outcome VARCHAR(10),
    pnl_percent DECIMAL(10, 4),
    created_at TIMESTAMPTZ DEFAULT NOW()
);
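The schema does not say how the `mae` column in `ml.prediction_logs` is computed. One plausible reading, shown here purely as an assumption, is the mean of the absolute errors on the high and low bounds. The function name `range_mae` is hypothetical.

```python
from decimal import Decimal


def range_mae(predicted_high, predicted_low, actual_high, actual_low) -> Decimal:
    """Mean absolute error over the two range bounds (assumed definition)."""
    high_error = abs(Decimal(predicted_high) - Decimal(actual_high))
    low_error = abs(Decimal(predicted_low) - Decimal(actual_low))
    return (high_error + low_error) / 2


# Errors of 1 on the high bound and 0.5 on the low bound average to 0.75
print(range_mae("101", "99", "102", "99.5"))
```

`Decimal` is used so the arithmetic matches the `DECIMAL` column types instead of introducing binary floating-point error.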
Concurrency and Scalability
Workers Configuration
# Gunicorn-style config (worker_class is a Gunicorn setting) running Uvicorn workers
workers = 4  # number of CPU cores
worker_class = "uvicorn.workers.UvicornWorker"
timeout = 60
keepalive = 5

# Thread pool for CPU-bound tasks
from concurrent.futures import ThreadPoolExecutor
executor = ThreadPoolExecutor(max_workers=8)
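The thread pool only helps if blocking work is actually handed to it. A minimal sketch of that hand-off from an async handler is shown below. `cpu_bound_predict` is a hypothetical stand-in for a blocking model call, and `predict_async` is an illustrative name.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=8)


def cpu_bound_predict(features):
    """Hypothetical stand-in for a blocking model call."""
    return sum(features) / len(features)


async def predict_async(features):
    loop = asyncio.get_running_loop()
    # Run the blocking call in the pool so the event loop stays responsive
    return await loop.run_in_executor(executor, cpu_bound_predict, features)


if __name__ == "__main__":
    print(asyncio.run(predict_async([1.0, 2.0, 3.0])))
```

While the worker thread runs the model call, the event loop remains free to accept other requests.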
Rate Limiting
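from fastapi import APIRouter, Request
from slowapi import Limiter
from slowapi.util import get_remote_address

router = APIRouter()
limiter = Limiter(key_func=get_remote_address)


# Limits: slowapi requires the endpoint to accept a `request: Request` argument
@router.get("/predictions")
@limiter.limit("100/minute")
async def get_prediction(request: Request):
    ...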
Caching Strategy
# Redis caching decorator
import json
from functools import wraps

# `redis` (an async Redis client) and `current_candle_time()` are assumed
# to be defined elsewhere in the application.


def cache_prediction(ttl: int = 60):
    def decorator(func):
        @wraps(func)
        async def wrapper(symbol: str, horizon: int, *args, **kwargs):
            # One cache entry per symbol, horizon and candle
            cache_key = f"prediction:{symbol}:{horizon}:{current_candle_time()}"
            cached = await redis.get(cache_key)
            if cached is not None:
                return json.loads(cached)
            result = await func(symbol, horizon, *args, **kwargs)
            await redis.setex(cache_key, ttl, json.dumps(result))
            return result
        return wrapper
    return decorator
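The decorator relies on `current_candle_time()`, which is not defined in this document. One possible implementation, assuming 5-minute candles aligned to the Unix epoch, floors the current time to the start of its candle. The `now` parameter and the `CANDLE_SECONDS` constant are additions for illustration and testing.

```python
import time

CANDLE_SECONDS = 300  # assumption: 5-minute candles


def current_candle_time(now=None) -> int:
    """Return the epoch second at which the current candle opened."""
    timestamp = int(time.time() if now is None else now)
    return timestamp - (timestamp % CANDLE_SECONDS)


# Any instant inside the same candle maps to the same value
print(current_candle_time(1700000123))  # 1700000100
print(current_candle_time(1700000399))  # 1700000100
```

Because every request within one candle produces the same timestamp, all of them share one cache key until the next candle opens.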
Error Handling
Exception Hierarchy
# app/core/exceptions.py
class MLEngineError(Exception):
    """Base exception for ML Engine"""
    pass


class ModelNotFoundError(MLEngineError):
    """Model file not found"""
    pass


class ModelLoadError(MLEngineError):
    """Failed to load model"""
    pass


class PredictionError(MLEngineError):
    """Error during prediction"""
    pass


class MarketDataError(MLEngineError):
    """Error fetching market data"""
    pass


class FeatureError(MLEngineError):
    """Error calculating features"""
    pass
Error Handlers
# app/main.py
from fastapi import Request
from fastapi.responses import JSONResponse

from app.core.exceptions import MLEngineError


@app.exception_handler(MLEngineError)
async def ml_exception_handler(request: Request, exc: MLEngineError):
    # Every MLEngineError subclass is reported as a 500 with its class name
    return JSONResponse(
        status_code=500,
        content={
            "error": exc.__class__.__name__,
            "message": str(exc),
            "path": str(request.url),
        },
    )
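The handler above returns 500 for every error type. If finer-grained responses were wanted, the exception hierarchy could drive the status code, as in the sketch below. The mapping itself (404 for a missing model, 502 for an upstream market data failure) is an illustrative assumption, not something this specification defines. The exception classes are repeated in reduced form so the example is self-contained.

```python
class MLEngineError(Exception):
    """Base exception for ML Engine"""


class ModelNotFoundError(MLEngineError):
    """Model file not found"""


class MarketDataError(MLEngineError):
    """Error fetching market data"""


# Hypothetical mapping; the status codes are illustrative choices
STATUS_BY_ERROR = {
    ModelNotFoundError: 404,
    MarketDataError: 502,
}


def status_for(exc: MLEngineError) -> int:
    """Pick the status of the closest mapped class, defaulting to 500."""
    for cls in type(exc).__mro__:
        if cls in STATUS_BY_ERROR:
            return STATUS_BY_ERROR[cls]
    return 500
```

Walking the method resolution order means a subclass of a mapped error inherits its parent's status without a separate entry.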
Monitoring and Observability
Health Checks
# app/api/routers/health.py
from fastapi import APIRouter, Depends

from app.services.model_manager import ModelManager

router = APIRouter()

# check_redis, check_postgres and check_binance_connection are assumed
# to be defined elsewhere in the application.


@router.get("/health")
async def health_check():
    return {"status": "healthy"}


@router.get("/health/detailed")
async def detailed_health(model_manager: ModelManager = Depends()):
    return {
        "status": "healthy",
        "models": model_manager.get_status(),
        "cache": await check_redis(),
        "database": await check_postgres(),
        "binance": await check_binance_connection(),
    }
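As written, the detailed endpoint reports `"status": "healthy"` regardless of what the individual checks return. A small sketch of deriving the overall status from the component results follows. It assumes each check yields a boolean, and the `"degraded"` label and the function name `overall_status` are hypothetical.

```python
def overall_status(checks: dict) -> str:
    """Return "healthy" only if every component check passed."""
    return "healthy" if all(checks.values()) else "degraded"


# Example with one failing dependency
print(overall_status({"cache": True, "database": False, "binance": True}))
```

A monitoring system polling the endpoint can then alert on the top-level field without inspecting each component.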
Metrics (Prometheus)
from prometheus_client import Counter, Histogram, Gauge

# Counters
predictions_total = Counter(
    'ml_predictions_total',
    'Total predictions made',
    ['symbol', 'horizon']
)

# Histograms
prediction_latency = Histogram(
    'ml_prediction_latency_seconds',
    'Prediction latency',
    buckets=[.01, .025, .05, .1, .25, .5, 1.0]
)

# Gauges
model_accuracy = Gauge(
    'ml_model_accuracy',
    'Current model accuracy',
    ['model_name']
)
Security
API Key Authentication
# app/core/security.py
from fastapi import Security, HTTPException
from fastapi.security import APIKeyHeader

from app.config.settings import settings

api_key_header = APIKeyHeader(name="X-API-Key")


async def validate_api_key(api_key: str = Security(api_key_header)):
    # Reject any request whose X-API-Key header does not match the configured key
    if api_key != settings.API_KEY:
        raise HTTPException(status_code=401, detail="Invalid API Key")
    return api_key
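Comparing secrets with `!=` can in principle leak timing information. As a possible hardening step, not part of the original design, the comparison could use the standard library's `hmac.compare_digest`. The helper name `api_keys_match` is hypothetical.

```python
import hmac


def api_keys_match(provided: str, expected: str) -> bool:
    """Compare two API keys in constant time."""
    # Encode to bytes so non-ASCII input is handled instead of raising TypeError
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))


print(api_keys_match("secret-key", "secret-key"))  # True
print(api_keys_match("secret-kez", "secret-key"))  # False
```

In `validate_api_key`, the condition would then read `if not api_keys_match(api_key, settings.API_KEY)`.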
Input Validation
# app/schemas/prediction.py
from pydantic import BaseModel, field_validator

from app.config.settings import settings


class PredictionRequest(BaseModel):
    symbol: str
    horizon: int

    # field_validator is the Pydantic v2 API, consistent with the
    # pydantic_settings import used in settings.py
    @field_validator('symbol')
    @classmethod
    def validate_symbol(cls, v):
        if v not in settings.SUPPORTED_SYMBOLS:
            raise ValueError(f'Symbol {v} not supported')
        return v

    @field_validator('horizon')
    @classmethod
    def validate_horizon(cls, v):
        if v not in settings.DEFAULT_HORIZONS:
            raise ValueError(f'Horizon {v} not supported')
        return v
Deployment
Docker Configuration
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
RUN pip install --no-cache-dir poetry
COPY pyproject.toml poetry.lock ./
RUN poetry install --only main
COPY app/ ./app/
COPY data/models/ ./data/models/
EXPOSE 8000
CMD ["poetry", "run", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
Docker Compose
# docker-compose.yml
version: '3.8'

services:
  ml-engine:
    build: .
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=${DATABASE_URL}
      - REDIS_URL=${REDIS_URL}
      - API_KEY=${ML_API_KEY}
      - BINANCE_API_KEY=${BINANCE_API_KEY}
      - BINANCE_API_SECRET=${BINANCE_API_SECRET}
    volumes:
      - ./data/models:/app/data/models
    depends_on:
      - redis
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data

volumes:
  redis_data:
References
- RF-ML-001: Price Prediction
- ET-ML-002: XGBoost Models
- ET-ML-003: Feature Engineering
- FastAPI Documentation
Author: Requirements-Analyst
Date: 2025-12-05