---
id: "ET-ML-001"
title: "ML Engine Architecture"
type: "Technical Specification"
status: "Done"
priority: "High"
epic: "OQI-006"
project: "trading-platform"
version: "1.0.0"
created_date: "2025-12-05"
updated_date: "2026-01-04"
---

# ET-ML-001: ML Engine Architecture

## Metadata

| Field | Value |
|-------|-------|
| **ID** | ET-ML-001 |
| **Epic** | OQI-006 - ML Signals |
| **Type** | Technical Specification |
| **Version** | 1.0.0 |
| **Status** | Approved |
| **Last updated** | 2025-12-05 |

---

## Purpose

Define the complete architecture of the ML Engine, including the service structure, the communication between components, and the data flows for trading predictions and signals.

---

## General Architecture

### High-Level Diagram

```
┌──────────────────────────────────────────────────────────────────────────┐
│                                ML ENGINE                                 │
│  ┌────────────────────────────────────────────────────────────────────┐  │
│  │                        FastAPI Application                         │  │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────────┐  │  │
│  │  │   Routers    │  │   Services   │  │    Background Tasks      │  │  │
│  │  │              │  │              │  │                          │  │  │
│  │  │ /predictions │  │ Predictor    │  │ ┌─────────────────────┐  │  │  │
│  │  │ /signals     │  │ SignalGen    │  │ │ Market Data Fetcher │  │  │  │
│  │  │ /indicators  │  │ Indicators   │  │ └─────────────────────┘  │  │  │
│  │  │ /models      │  │ ModelManager │  │ ┌─────────────────────┐  │  │  │
│  │  │ /health      │  │              │  │ │ Training Pipeline   │  │  │  │
│  │  └──────────────┘  └──────────────┘  │ └─────────────────────┘  │  │  │
│  │                                      │ ┌─────────────────────┐  │  │  │
│  │  ┌─────────────────────────────────┐ │ │ Signal Publisher    │  │  │  │
│  │  │          Models Layer           │ │ └─────────────────────┘  │  │  │
│  │  │  ┌─────────────┐  ┌───────────┐ │ └──────────────────────────┘  │  │
│  │  │  │  XGBoost    │  │ Ensemble  │ │                               │  │
│  │  │  │  Models     │  │ Manager   │ │                               │  │
│  │  │  └─────────────┘  └───────────┘ │                               │  │
│  │  └─────────────────────────────────┘                               │  │
│  └────────────────────────────────────────────────────────────────────┘  │
│                                                                          │
│  ┌────────────────────────────────────────────────────────────────────┐  │
│  │                             Data Layer                             │  │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────────┐  │  │
│  │  │    Redis     │  │  PostgreSQL  │  │      Model Storage       │  │  │
│  │  │    Cache     │  │  (Metrics)   │  │      (File System)       │  │  │
│  │  └──────────────┘  └──────────────┘  └──────────────────────────┘  │  │
│  └────────────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────────┘
              │                      │                      │
              ▼                      ▼                      ▼
        ┌───────────┐          ┌───────────┐          ┌───────────┐
        │  Binance  │          │  Backend  │          │ Frontend  │
        │    API    │          │  Express  │          │   React   │
        └───────────┘          └───────────┘          └───────────┘
```

---

## Directory Structure

```
apps/ml-engine/
├── app/
│   ├── __init__.py
│   ├── main.py                      # FastAPI entry point
│   ├── config/
│   │   ├── __init__.py
│   │   ├── settings.py              # Pydantic settings
│   │   └── logging.py               # Logging configuration
│   ├── api/
│   │   ├── __init__.py
│   │   ├── dependencies.py          # Dependency injection
│   │   └── routers/
│   │       ├── __init__.py
│   │       ├── predictions.py       # /predictions endpoints
│   │       ├── signals.py           # /signals endpoints
│   │       ├── indicators.py        # /indicators endpoints
│   │       ├── models.py            # /models endpoints
│   │       └── health.py            # /health endpoints
│   ├── core/
│   │   ├── __init__.py
│   │   ├── exceptions.py            # Custom exceptions
│   │   ├── security.py              # API key validation
│   │   └── middleware.py            # Custom middleware
│   ├── services/
│   │   ├── __init__.py
│   │   ├── predictor.py             # Prediction service
│   │   ├── signal_generator.py      # Signal generation
│   │   ├── indicator_calculator.py  # Technical indicators
│   │   ├── model_manager.py         # Model lifecycle
│   │   └── market_data.py           # Data fetching
│   ├── models/
│   │   ├── __init__.py
│   │   ├── base.py                  # Base model class
│   │   ├── range_predictor.py       # Price range predictor
│   │   ├── tpsl_classifier.py       # TP/SL classifier
│   │   └── signal_classifier.py     # Signal classifier
│   ├── features/
│   │   ├── __init__.py
│   │   ├── builder.py               # Feature builder
│   │   ├── volatility.py            # Volatility features
│   │   ├── momentum.py              # Momentum features
│   │   ├── trend.py                 # Trend features
│   │   └── volume.py                # Volume features
│   ├── schemas/
│   │   ├── __init__.py
│   │   ├── prediction.py            # Prediction DTOs
│   │   ├── signal.py                # Signal DTOs
│   │   └── indicator.py             # Indicator DTOs
│   └── tasks/
│       ├── __init__.py
│       ├── market_data_fetcher.py   # Background data fetch
│       ├── training_pipeline.py     # Model training
│       └── signal_publisher.py      # Signal broadcast
├── data/
│   ├── models/                      # Trained model files
│   │   ├── range_predictor/
│   │   ├── tpsl_classifier/
│   │   └── signal_classifier/
│   └── cache/                       # Cached market data
├── tests/
│   ├── __init__.py
│   ├── conftest.py
│   ├── unit/
│   └── integration/
├── scripts/
│   ├── train_models.py
│   └── evaluate_models.py
├── requirements.txt
├── Dockerfile
└── docker-compose.yml
```

---

## Main Components

### 1. FastAPI Application

```python
# app/main.py
from contextlib import asynccontextmanager

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

from app.config.settings import settings
from app.api.routers import predictions, signals, indicators, models, health
from app.tasks.market_data_fetcher import MarketDataFetcher
from app.services.model_manager import ModelManager


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup: load models, then start the background market data fetcher
    model_manager = ModelManager()
    await model_manager.load_all_models()

    market_fetcher = MarketDataFetcher()
    await market_fetcher.start()

    yield

    # Shutdown: stop the background fetcher
    await market_fetcher.stop()


app = FastAPI(
    title="Trading Platform ML Engine",
    version="1.0.0",
    lifespan=lifespan,
)

app.add_middleware(
    CORSMiddleware,
    allow_origins=settings.CORS_ORIGINS,
    allow_methods=["*"],
    allow_headers=["*"],
)

app.include_router(health.router, tags=["Health"])
app.include_router(predictions.router, prefix="/predictions", tags=["Predictions"])
app.include_router(signals.router, prefix="/signals", tags=["Signals"])
app.include_router(indicators.router, prefix="/indicators", tags=["Indicators"])
app.include_router(models.router, prefix="/models", tags=["Models"])
```

### 2. Configuration

```python
# app/config/settings.py
from typing import List

from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    # Values are read from the environment, falling back to the .env file
    model_config = SettingsConfigDict(env_file=".env")

    # API
    API_HOST: str = "0.0.0.0"
    API_PORT: int = 8000
    API_KEY: str

    # Database
    DATABASE_URL: str
    REDIS_URL: str

    # Binance
    BINANCE_API_KEY: str
    BINANCE_API_SECRET: str

    # Models
    MODEL_PATH: str = "./data/models"
    SUPPORTED_SYMBOLS: List[str] = ["BTCUSDT", "ETHUSDT"]

    # Features
    DEFAULT_HORIZONS: List[int] = [6, 18, 36, 72]  # candles

    # CORS
    CORS_ORIGINS: List[str] = ["http://localhost:3000"]


settings = Settings()
```

---

## Data Flows

### 1. Prediction Flow

```
┌──────────┐    ┌──────────┐    ┌────────────┐    ┌──────────┐    ┌──────────┐
│  Client  │───▶│  Router  │───▶│  Service   │───▶│ Features │───▶│  Model   │
│          │    │          │    │            │    │ Builder  │    │          │
└──────────┘    └──────────┘    └────────────┘    └──────────┘    └──────────┘
                                       │                                │
                                       ▼                                │
                                ┌────────────┐                          │
                                │   Cache    │◀─────────────────────────┘
                                │  (Redis)   │
                                └────────────┘
```

**Sequence:**

1. The client requests a prediction via the API
2. The router validates the request and extracts its parameters
3. The service checks the cache (Redis)
4. On a cache miss, the Feature Builder generates the features
5. The model makes the prediction
6. The result is cached and returned

### 2. Signal Flow

```
┌──────────────┐    ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│ Market Data  │───▶│   Feature    │───▶│    Signal    │───▶│  Publisher   │
│   Fetcher    │    │   Builder    │    │  Generator   │    │   (Redis)    │
└──────────────┘    └──────────────┘    └──────────────┘    └──────────────┘
        │                                       │                   │
        ▼                                       ▼                   ▼
┌──────────────┐                        ┌──────────────┐    ┌──────────────┐
│   Binance    │                        │  PostgreSQL  │    │   Backend    │
│     API      │                        │    (Logs)    │    │   Express    │
└──────────────┘                        └──────────────┘    └──────────────┘
```

---

## Inter-Service Communication

### Backend ↔ ML Engine

```yaml
# HTTP communication
Protocol: REST over HTTPS
Authentication: API Key (X-API-Key header)
Format: JSON
Timeout: 30 seconds

# Main endpoints
Endpoints:
  - POST /predictions     # Request a prediction
  - POST /signals         # Generate a signal
  - GET /indicators       # Get indicators
  - GET /models/status    # Model status
```

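As an illustration of the contract above, the sketch below builds a `POST /predictions` request carrying the `X-API-Key` header and a JSON body. It is written in Python for consistency with the rest of this document, although the actual backend is Express. The host name, the example key, and the helper `build_prediction_request` are assumptions made for the example; the body fields follow the `PredictionRequest` schema.

```python
import json
import urllib.request

# Assumed values for illustration; real values come from deployment config.
ML_ENGINE_URL = "https://ml-engine.example.internal"
TIMEOUT_SECONDS = 30  # matches the documented timeout


def build_prediction_request(api_key: str, symbol: str, horizon: int) -> urllib.request.Request:
    """Build a POST /predictions request with the API key header and JSON body."""
    body = json.dumps({"symbol": symbol, "horizon": horizon}).encode("utf-8")
    return urllib.request.Request(
        url=f"{ML_ENGINE_URL}/predictions",
        data=body,
        method="POST",
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
    )


req = build_prediction_request("example-key", "BTCUSDT", 6)

# Sending it would look like this (not executed here: the host is a placeholder):
# with urllib.request.urlopen(req, timeout=TIMEOUT_SECONDS) as resp:
#     prediction = json.load(resp)
```
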
### ML Engine ↔ Redis

```yaml
# Prediction cache
Key pattern: "prediction:{symbol}:{horizon}:{timestamp}"
TTL: 60 seconds (expires well within one 5-minute candle)

# Signal publication
Channel: "signals:{symbol}"
Message format: JSON serialized Signal object
```

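The key pattern and channel naming above can be sketched as small helpers. The `Signal` dataclass and its fields are hypothetical (loosely mirroring the `ml.signal_logs` columns), and the values are illustrative only.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Signal:
    # Hypothetical fields, loosely mirroring the ml.signal_logs columns
    symbol: str
    signal_type: str
    confidence: float
    entry_price: float


def prediction_key(symbol: str, horizon: int, timestamp: int) -> str:
    """Cache key following the documented pattern."""
    return f"prediction:{symbol}:{horizon}:{timestamp}"


def signal_channel(symbol: str) -> str:
    """Pub/sub channel name for a symbol."""
    return f"signals:{symbol}"


def serialize_signal(signal: Signal) -> str:
    """JSON-serialize a Signal for publication."""
    return json.dumps(asdict(signal))


# Illustrative values only
signal = Signal("BTCUSDT", "BUY", 0.82, 43250.5)
channel = signal_channel(signal.symbol)
message = serialize_signal(signal)

# With an asyncio Redis client this would be published roughly as:
#   await redis.publish(channel, message)
```
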
### ML Engine ↔ PostgreSQL

```sql
-- Metrics and logs only
CREATE SCHEMA IF NOT EXISTS ml;

CREATE TABLE ml.prediction_logs (
    id UUID PRIMARY KEY,
    symbol VARCHAR(20),
    horizon INTEGER,
    predicted_high DECIMAL(20, 8),
    predicted_low DECIMAL(20, 8),
    actual_high DECIMAL(20, 8),
    actual_low DECIMAL(20, 8),
    mae DECIMAL(10, 6),
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE ml.signal_logs (
    id UUID PRIMARY KEY,
    symbol VARCHAR(20),
    signal_type VARCHAR(10),
    confidence DECIMAL(5, 4),
    entry_price DECIMAL(20, 8),
    outcome VARCHAR(10),
    pnl_percent DECIMAL(10, 4),
    created_at TIMESTAMPTZ DEFAULT NOW()
);
```

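The schema does not say how the `mae` column is computed. One plausible reading, shown here purely as an assumption, is the mean of the absolute errors on the high and the low of a single prediction; `prediction_mae` is a hypothetical helper name.

```python
from decimal import Decimal


def prediction_mae(
    predicted_high: Decimal,
    predicted_low: Decimal,
    actual_high: Decimal,
    actual_low: Decimal,
) -> Decimal:
    """Mean absolute error over high and low (assumed definition of `mae`)."""
    errors = [abs(predicted_high - actual_high), abs(predicted_low - actual_low)]
    return sum(errors, Decimal("0")) / Decimal(len(errors))


# Illustrative values only
mae = prediction_mae(
    Decimal("101.0"), Decimal("99.0"), Decimal("100.5"), Decimal("98.0")
)
```

`Decimal` is used instead of `float` so the arithmetic matches the `DECIMAL` column types without binary rounding.
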
---

## Concurrency and Scalability

### Workers Configuration

```python
# Gunicorn config running Uvicorn workers
from concurrent.futures import ThreadPoolExecutor

workers = 4  # one per CPU core
worker_class = "uvicorn.workers.UvicornWorker"
timeout = 60
keepalive = 5

# Thread pool for CPU-bound tasks
executor = ThreadPoolExecutor(max_workers=8)
```

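A thread pool like the one above is typically used from async endpoints through `run_in_executor`, so a blocking model call does not stall the event loop. The sketch below assumes a hypothetical `blocking_predict` function standing in for a synchronous model call.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=8)


def blocking_predict(features):
    """Stand-in for a synchronous, CPU-bound model call (hypothetical)."""
    return sum(features) / len(features)


async def predict_async(features):
    # Run the blocking call in the pool so the event loop stays responsive
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, blocking_predict, features)


async def main():
    # Several predictions can be awaited concurrently
    return await asyncio.gather(
        predict_async([1.0, 2.0, 3.0]),
        predict_async([10.0, 20.0]),
    )


results = asyncio.run(main())
```
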
### Rate Limiting

```python
from fastapi import APIRouter, Request
from slowapi import Limiter
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
router = APIRouter()

# Limits: slowapi requires the endpoint to accept a `request: Request` argument
@router.get("/predictions")
@limiter.limit("100/minute")
async def get_prediction(request: Request):
    ...
```

### Caching Strategy

```python
# Redis caching decorator
import json
import time
from functools import wraps

from redis.asyncio import Redis

from app.config.settings import settings

redis = Redis.from_url(settings.REDIS_URL, decode_responses=True)

CANDLE_SECONDS = 300  # assumes 5-minute candles


def current_candle_time() -> int:
    """Open time (Unix seconds) of the current candle; assumed helper."""
    now = int(time.time())
    return now - now % CANDLE_SECONDS


def cache_prediction(ttl: int = 60):
    def decorator(func):
        @wraps(func)
        async def wrapper(symbol: str, horizon: int, *args, **kwargs):
            cache_key = f"prediction:{symbol}:{horizon}:{current_candle_time()}"

            cached = await redis.get(cache_key)
            if cached is not None:
                return json.loads(cached)

            result = await func(symbol, horizon, *args, **kwargs)
            await redis.setex(cache_key, ttl, json.dumps(result))
            return result
        return wrapper
    return decorator
```

---

## Error Handling

### Exception Hierarchy

```python
# app/core/exceptions.py
class MLEngineError(Exception):
    """Base exception for ML Engine"""


class ModelNotFoundError(MLEngineError):
    """Model file not found"""


class ModelLoadError(MLEngineError):
    """Failed to load model"""


class PredictionError(MLEngineError):
    """Error during prediction"""


class MarketDataError(MLEngineError):
    """Error fetching market data"""


class FeatureError(MLEngineError):
    """Error calculating features"""
```

### Error Handlers

```python
# app/main.py
from fastapi import Request
from fastapi.responses import JSONResponse

from app.core.exceptions import MLEngineError


@app.exception_handler(MLEngineError)
async def ml_exception_handler(request: Request, exc: MLEngineError):
    # Every MLEngineError subclass is reported as a 500 with a JSON body
    return JSONResponse(
        status_code=500,
        content={
            "error": exc.__class__.__name__,
            "message": str(exc),
            "path": request.url.path,
        },
    )
```

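The handler above returns 500 for every `MLEngineError`. If finer-grained status codes were wanted, one option is a lookup keyed on the exception class. The mapping below is an assumption for illustration, not part of the current design, and `status_code_for` is a hypothetical helper.

```python
class MLEngineError(Exception):
    """Base exception for ML Engine"""


class ModelNotFoundError(MLEngineError):
    """Model file not found"""


class MarketDataError(MLEngineError):
    """Error fetching market data"""


class PredictionError(MLEngineError):
    """Error during prediction"""


# Assumed mapping for illustration; unlisted classes fall back to 500
STATUS_BY_ERROR = {
    ModelNotFoundError: 404,
    MarketDataError: 502,
}


def status_code_for(exc: MLEngineError) -> int:
    """Walk the class hierarchy and return the first mapped status code."""
    for cls in type(exc).__mro__:
        if cls in STATUS_BY_ERROR:
            return STATUS_BY_ERROR[cls]
    return 500


codes = [
    status_code_for(ModelNotFoundError("missing")),
    status_code_for(MarketDataError("upstream down")),
    status_code_for(PredictionError("failed")),
]
```

Walking the MRO means a subclass of a mapped exception inherits its parent's status code unless it has its own entry.
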
---

## Monitoring and Observability

### Health Checks

```python
# app/api/routers/health.py
from fastapi import APIRouter, Depends

from app.services.model_manager import ModelManager

router = APIRouter()


@router.get("/health")
async def health_check():
    return {"status": "healthy"}


# check_redis, check_postgres and check_binance_connection are assumed to be
# async helpers defined elsewhere in the application.
@router.get("/health/detailed")
async def detailed_health(model_manager: ModelManager = Depends()):
    return {
        "status": "healthy",
        "models": model_manager.get_status(),
        "cache": await check_redis(),
        "database": await check_postgres(),
        "binance": await check_binance_connection(),
    }
```

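The detailed endpoint above always reports `"healthy"`. A possible refinement, sketched here under the assumption that each check returns a boolean, derives the overall status from the component results. The three check functions are hypothetical stubs, and the `"degraded"` label is an assumed convention.

```python
import asyncio


async def check_redis() -> bool:
    return True   # hypothetical stub


async def check_postgres() -> bool:
    return True   # hypothetical stub


async def check_binance_connection() -> bool:
    return False  # hypothetical stub simulating an outage


async def detailed_health() -> dict:
    names = ["cache", "database", "binance"]
    # Run the checks concurrently; an exception counts as a failed check
    results = await asyncio.gather(
        check_redis(),
        check_postgres(),
        check_binance_connection(),
        return_exceptions=True,
    )
    components = {name: result is True for name, result in zip(names, results)}
    status = "healthy" if all(components.values()) else "degraded"
    return {"status": status, **components}


report = asyncio.run(detailed_health())
```
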
### Metrics (Prometheus)

```python
from prometheus_client import Counter, Histogram, Gauge

# Counters
predictions_total = Counter(
    'ml_predictions_total',
    'Total predictions made',
    ['symbol', 'horizon']
)

# Histograms
prediction_latency = Histogram(
    'ml_prediction_latency_seconds',
    'Prediction latency',
    buckets=[.01, .025, .05, .1, .25, .5, 1.0]
)

# Gauges
model_accuracy = Gauge(
    'ml_model_accuracy',
    'Current model accuracy',
    ['model_name']
)
```

---

## Security

### API Key Authentication

```python
# app/core/security.py
import secrets

from fastapi import Security, HTTPException
from fastapi.security import APIKeyHeader

from app.config.settings import settings

api_key_header = APIKeyHeader(name="X-API-Key")


async def validate_api_key(api_key: str = Security(api_key_header)):
    # Constant-time comparison avoids leaking key contents through timing
    if not secrets.compare_digest(api_key.encode(), settings.API_KEY.encode()):
        raise HTTPException(status_code=401, detail="Invalid API Key")
    return api_key
```

### Input Validation

```python
# app/schemas/prediction.py
from pydantic import BaseModel, field_validator

from app.config.settings import settings


class PredictionRequest(BaseModel):
    symbol: str
    horizon: int

    @field_validator("symbol")
    @classmethod
    def validate_symbol(cls, v: str) -> str:
        if v not in settings.SUPPORTED_SYMBOLS:
            raise ValueError(f"Symbol {v} not supported")
        return v

    @field_validator("horizon")
    @classmethod
    def validate_horizon(cls, v: int) -> int:
        if v not in settings.DEFAULT_HORIZONS:
            raise ValueError(f"Horizon {v} not supported")
        return v
```

---

## Deployment

### Docker Configuration

```dockerfile
# Dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first so this layer is cached across code changes
RUN pip install --no-cache-dir poetry
COPY pyproject.toml poetry.lock ./
RUN poetry install --only main --no-root

COPY app/ ./app/
COPY data/models/ ./data/models/

EXPOSE 8000

CMD ["poetry", "run", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

### Docker Compose

```yaml
# docker-compose.yml
version: '3.8'

services:
  ml-engine:
    build: .
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=${DATABASE_URL}
      - REDIS_URL=${REDIS_URL}
      - API_KEY=${ML_API_KEY}
      - BINANCE_API_KEY=${BINANCE_API_KEY}
      - BINANCE_API_SECRET=${BINANCE_API_SECRET}
    volumes:
      - ./data/models:/app/data/models
    depends_on:
      - redis
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data

volumes:
  redis_data:
```

---

## References

- [RF-ML-001: Price Prediction](../requerimientos/RF-ML-001-predicciones.md)
- [ET-ML-002: XGBoost Models](./ET-ML-002-modelos.md)
- [ET-ML-003: Feature Engineering](./ET-ML-003-features.md)
- [FastAPI Documentation](https://fastapi.tiangolo.com/)

---

**Author:** Requirements-Analyst

**Date:** 2025-12-05