
| id | title | type | project | version | updated_date |
|----|-------|------|---------|---------|--------------|
| README | ML Signals and Predictions | Documentation | trading-platform | 1.0.0 | 2026-01-04 |

OQI-006: ML Signals and Predictions

Status: Implemented
Date: 2025-12-05
Module: apps/ml-services


Description

XGBoost-based price prediction system that predicts:

  • The expected maximum price over the prediction horizon
  • The expected minimum price over the prediction horizon
  • The confidence level of the prediction

Architecture

┌─────────────────┐     ┌─────────────────────────────────────────┐
│   Binance API   │────▶│          ML SERVICES (FastAPI)          │
│  (Market Data)  │     │                Port 8000                │
└─────────────────┘     │                                         │
                        │  ┌──────────────┐  ┌──────────────┐     │
                        │  │ MarketData   │  │  XGBoost     │     │
                        │  │   Fetcher    │──│  Predictor   │     │
                        │  └──────────────┘  └──────────────┘     │
                        │         │                 │             │
                        │         ▼                 ▼             │
                        │  ┌──────────────┐  ┌──────────────┐     │
                        │  │   Feature    │  │   Training   │     │
                        │  │ Engineering  │  │   Pipeline   │     │
                        │  └──────────────┘  └──────────────┘     │
                        └─────────────────────────────────────────┘

Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | /health | Health check |
| GET | /api/stats | Service status |
| GET | /api/predict/{symbol} | Price predictions |
| POST | /api/train/{symbol} | Train the model |
| GET | /api/training/status | Training status |
| GET | /api/signals/{symbol} | Trading signals |
| GET | /api/indicators/{symbol} | Technical indicators |
| WS | /ws/{symbol} | Real-time predictions |

XGBoost Model

Configuration

from dataclasses import dataclass

@dataclass
class ModelConfig:
    n_estimators: int = 100        # Number of trees
    max_depth: int = 6             # Maximum tree depth
    learning_rate: float = 0.1     # Learning rate (shrinkage)
    subsample: float = 0.8         # Row subsample per tree
    colsample_bytree: float = 0.8  # Column subsample per tree
    min_child_weight: int = 1      # Minimum sum of instance weights in a child
    random_state: int = 42         # Seed for reproducibility
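
One plausible way to hand this configuration to XGBoost is to flatten the dataclass into keyword arguments. The sketch below uses only the standard library; the `to_xgb_params` helper is hypothetical, and passing its result to `xgboost.XGBRegressor` is an assumption about how the predictor is built, not something this README confirms.

```python
from dataclasses import asdict, dataclass


@dataclass
class ModelConfig:
    n_estimators: int = 100
    max_depth: int = 6
    learning_rate: float = 0.1
    subsample: float = 0.8
    colsample_bytree: float = 0.8
    min_child_weight: int = 1
    random_state: int = 42


def to_xgb_params(config: ModelConfig) -> dict:
    """Hypothetical helper: flatten the config into keyword arguments."""
    return asdict(config)


# Override a single hyperparameter and keep the remaining defaults.
params = to_xgb_params(ModelConfig(max_depth=4))
print(params)

# With xgboost installed, one regressor per target could then be created as:
#   model = xgboost.XGBRegressor(**params)
```

Keeping the hyperparameters in a dataclass means an override such as `ModelConfig(max_depth=4)` is type-annotated and easy to log alongside the training metrics.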

Features (30+)

Volatility:

  • volatility_5, volatility_10, volatility_20, volatility_50
  • atr_5, atr_10, atr_20, atr_50

Momentum:

  • momentum_5, momentum_10, momentum_20
  • roc_5, roc_10, roc_20

Moving averages:

  • sma_5, sma_10, sma_20, sma_50
  • ema_5, ema_10, ema_20, ema_50
  • sma_ratio_5, sma_ratio_10, sma_ratio_20, sma_ratio_50

Indicators:

  • rsi_14 - Relative Strength Index
  • macd, macd_signal, macd_histogram
  • bb_position - Position within the Bollinger Bands

Volume:

  • volume_ratio - Volume relative to its 20-period SMA

High/Low:

  • hl_range_pct - High-low range as a percentage
  • high_distance, low_distance
  • hist_max_ratio_*, hist_min_ratio_*
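
The README lists feature names but not their formulas. The sketch below shows how a few of them might be computed for the most recent candle, in plain Python. The definitions are assumptions (for instance, `roc_n` as the fractional change over `n` candles and `sma_ratio_n` as the close divided by its SMA), and `latest_features` is a hypothetical helper, not a function from `indicators.py`.

```python
def sma(values: list[float], window: int) -> float:
    """Simple moving average of the last `window` values."""
    if len(values) < window:
        raise ValueError("not enough data for the requested window")
    return sum(values[-window:]) / window


def latest_features(close: list[float], high: list[float], low: list[float],
                    window: int = 5) -> dict[str, float]:
    """Compute a few features for the most recent candle (assumed definitions)."""
    if len(close) <= window:
        raise ValueError("need more than `window` candles")
    last, past = close[-1], close[-1 - window]
    avg = sma(close, window)
    return {
        f"sma_{window}": avg,
        f"sma_ratio_{window}": last / avg,        # close relative to its SMA
        f"momentum_{window}": last - past,        # absolute change over the window
        f"roc_{window}": last / past - 1.0,       # fractional change over the window
        "hl_range_pct": (high[-1] - low[-1]) / last * 100.0,
    }


# Made-up candles, for illustration only.
close = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0]
high = [c + 1.0 for c in close]
low = [c - 1.0 for c in close]
features = latest_features(close, high, low, window=5)
print(features)
```

Repeating the call with windows 5, 10, 20, and 50 would produce the suffixed families listed above.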

Targets

The model predicts two targets:

  1. max_ratio: the future maximum relative to the current price, expressed as a fractional gain

    max_ratio = future_high / current_price - 1
    
  2. min_ratio: the future minimum relative to the current price, expressed as a fractional drop

    min_ratio = 1 - future_low / current_price
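
The two formulas can be turned into training labels roughly as follows. This is a sketch under stated assumptions: `current_price` is taken to be the close of candle `i`, the future window is the next `horizon` candles (excluding candle `i`), and `max_min_targets` is a hypothetical name.

```python
def max_min_targets(close, high, low, horizon):
    """Return (max_ratio, min_ratio) lists, one entry per candle
    that has a full window of `horizon` future candles."""
    max_ratio, min_ratio = [], []
    for i in range(len(close) - horizon):
        future_high = max(high[i + 1 : i + 1 + horizon])
        future_low = min(low[i + 1 : i + 1 + horizon])
        max_ratio.append(future_high / close[i] - 1.0)
        min_ratio.append(1.0 - future_low / close[i])
    return max_ratio, min_ratio


# Made-up candles, for illustration only.
close = [100.0, 102.0, 101.0, 104.0]
high = [101.0, 103.0, 102.0, 105.0]
low = [99.0, 100.0, 100.0, 102.0]
max_ratio, min_ratio = max_min_targets(close, high, low, horizon=2)
print(max_ratio, min_ratio)
```

The last `horizon` candles have no complete future window and therefore yield no label, which is one reason the number of usable training rows is smaller than the number of candles fetched.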
    

Prediction Horizons

| Horizon | Candles (5 min) | Time | Use |
|---------|-----------------|------|-----|
| Scalping | 6 | 30 min | Fast trading |
| Intraday | 18 | 90 min | Day trading |
| Swing | 36 | 3 hours | Swing trading |
| Position | 72 | 6 hours | Longer-term positions |
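
The table reduces to a small lookup: each horizon is a candle count, and the time span follows from the 5-minute candle size. A minimal sketch, with hypothetical constant and function names:

```python
CANDLE_MINUTES = 5

# Candle counts per horizon, copied from the table above.
HORIZON_CANDLES = {
    "scalping": 6,
    "intraday": 18,
    "swing": 36,
    "position": 72,
}


def horizon_minutes(name: str) -> int:
    """Length of a horizon in minutes."""
    return HORIZON_CANDLES[name] * CANDLE_MINUTES


for name in HORIZON_CANDLES:
    print(name, horizon_minutes(name))
```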

Training Metrics

| Metric | Description | Typical value |
|--------|-------------|---------------|
| high_mae | Mean absolute error (high) | 0.1% - 2% |
| high_rmse | Root mean squared error (high) | 0.15% - 2.5% |
| low_mae | Mean absolute error (low) | 0.1% - 2% |
| low_rmse | Root mean squared error (low) | 0.15% - 2.5% |
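
For reference, both metrics can be computed in a few lines. The sketch below uses plain Python and made-up ratio values; it assumes the errors are measured on the ratio targets, so that a value of 0.001 corresponds to 0.1%.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Made-up target ratios and predictions, for illustration only.
y_true = [0.0030, 0.0050, 0.0020]
y_pred = [0.0025, 0.0060, 0.0020]
high_mae = mae(y_true, y_pred)
high_rmse = rmse(y_true, y_pred)
print(f"MAE {high_mae * 100:.3f}%  RMSE {high_rmse * 100:.3f}%")
```

RMSE is never smaller than MAE because squaring weights large errors more heavily, which matches the slightly higher typical range quoted for the RMSE rows.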

Prediction Example

Request

curl "http://localhost:8000/api/predict/BTCUSDT?horizon=all"

Response

{
  "symbol": "BTCUSDT",
  "timestamp": "2025-12-05T18:05:08.889327Z",
  "current_price": 89388.99,
  "predictions": {
    "scalping": {
      "high": 89663.86,
      "low": 88930.53,
      "high_ratio": 1.0031,
      "low_ratio": 0.9949,
      "confidence": 0.69,
      "minutes": 30
    },
    "intraday": {
      "high": 90213.60,
      "low": 88013.61,
      "high_ratio": 1.0093,
      "low_ratio": 0.9848,
      "confidence": 0.59,
      "minutes": 90
    },
    "swing": {
      "high": 91038.21,
      "low": 86638.23,
      "high_ratio": 1.0187,
      "low_ratio": 0.9698,
      "confidence": 0.45,
      "minutes": 180
    },
    "position": {
      "high": 92687.43,
      "low": 83887.47,
      "high_ratio": 1.0378,
      "low_ratio": 0.9405,
      "confidence": 0.45,
      "minutes": 360
    }
  },
  "model_version": "1.0.0",
  "is_trained": true
}
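
A client will usually want to act only on the horizons it trusts. The sketch below parses a trimmed copy of the response above and keeps the horizons whose confidence meets a threshold, reporting the predicted high-low band as a percentage of the current price. The `confident_horizons` helper and the 0.5 threshold are hypothetical and not part of the service.

```python
import json

# Trimmed copy of the example response (two horizons only).
payload = json.loads("""
{
  "symbol": "BTCUSDT",
  "current_price": 89388.99,
  "predictions": {
    "scalping": {"high": 89663.86, "low": 88930.53, "confidence": 0.69, "minutes": 30},
    "swing": {"high": 91038.21, "low": 86638.23, "confidence": 0.45, "minutes": 180}
  }
}
""")


def confident_horizons(response: dict, min_confidence: float) -> dict:
    """Map each sufficiently confident horizon to its band width in percent."""
    price = response["current_price"]
    selected = {}
    for name, pred in response["predictions"].items():
        if pred["confidence"] >= min_confidence:
            width_pct = (pred["high"] - pred["low"]) / price * 100.0
            selected[name] = round(width_pct, 2)
    return selected


result = confident_horizons(payload, min_confidence=0.5)
print(result)
```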

Training

Start Training

curl -X POST "http://localhost:8000/api/train/BTCUSDT?samples=500"

Response

{
  "status": "training_started",
  "symbol": "BTCUSDT",
  "samples": 500,
  "message": "Model training started in background. Check /api/stats for progress."
}

Check Status

curl http://localhost:8000/api/training/status

{
  "training_in_progress": false,
  "is_trained": true,
  "last_training": {
    "symbol": "BTCUSDT",
    "timestamp": "2025-12-05T18:04:49.757994",
    "samples": 500,
    "metrics": {
      "high_mae": 0.00099,
      "high_rmse": 0.00141,
      "low_mae": 0.00173,
      "low_rmse": 0.00284,
      "train_samples": 355,
      "test_samples": 89
    }
  }
}
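
Because training runs in the background, a caller has to poll until `training_in_progress` becomes false. The sketch below shows one way to do that; `wait_for_training` is a hypothetical helper, and the status fetcher is injected so the loop can be exercised without a running server.

```python
import time
from typing import Callable


def wait_for_training(fetch_status: Callable[[], dict],
                      poll_seconds: float = 5.0,
                      max_polls: int = 60) -> dict:
    """Poll until training finishes and return the final status payload."""
    for _ in range(max_polls):
        status = fetch_status()
        if not status["training_in_progress"]:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("training did not finish within the polling budget")


# Fake fetcher standing in for GET /api/training/status.
_responses = iter([
    {"training_in_progress": True, "is_trained": False},
    {"training_in_progress": False, "is_trained": True},
])
final = wait_for_training(lambda: next(_responses), poll_seconds=0.0)
print(final)
```

Against a live server, `fetch_status` could be something like `lambda: requests.get("http://localhost:8000/api/training/status").json()`.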

Market Data

Data Sources

| Source | Use | API |
|--------|-----|-----|
| Binance | Crypto (BTC, ETH) | REST + WebSocket |
| Mock Data | Testing | Generated locally |

OHLCV Structure

from dataclasses import dataclass

import numpy as np

@dataclass
class OHLCV:
    timestamp: np.ndarray  # Epoch milliseconds
    open: np.ndarray       # Open price
    high: np.ndarray       # High price
    low: np.ndarray        # Low price
    close: np.ndarray      # Close price
    volume: np.ndarray     # Volume
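
Binance's REST klines endpoint returns one row per candle, with the open time in epoch milliseconds followed by open, high, low, close, and volume encoded as strings. The sketch below transposes such rows into per-field columns using plain lists; `klines_to_columns` is a hypothetical helper, and how `MarketDataFetcher` actually does this is not shown in this README.

```python
FIELDS = ("open", "high", "low", "close", "volume")


def klines_to_columns(klines: list[list]) -> dict[str, list]:
    """Transpose Binance kline rows into per-field columns."""
    columns: dict[str, list] = {"timestamp": []}
    columns.update({name: [] for name in FIELDS})
    for row in klines:
        columns["timestamp"].append(int(row[0]))      # open time, epoch ms
        for offset, name in enumerate(FIELDS, start=1):
            columns[name].append(float(row[offset]))  # prices arrive as strings
    return columns


# Made-up rows, truncated to the first six fields, for illustration only.
rows = [
    [1764957600000, "89300.00", "89450.00", "89250.00", "89388.99", "12.5"],
    [1764957900000, "89388.99", "89500.00", "89350.00", "89420.10", "9.8"],
]
columns = klines_to_columns(rows)
print(columns["close"])
```

Each column could then be wrapped with `np.asarray` to populate the `OHLCV` dataclass.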

Files

apps/ml-services/
├── src/
│   ├── api/
│   │   ├── server.py          # FastAPI app
│   │   └── schemas/
│   │       └── prediction.py  # Pydantic schemas
│   ├── data/
│   │   └── market_data.py     # MarketDataFetcher
│   └── models/
│       ├── xgboost_model.py   # XGBoostPredictor
│       ├── predictor.py       # MaxMinPricePredictor
│       └── indicators.py      # Technical indicators
├── trained_models/            # Saved models
│   ├── xgb_high.json
│   └── xgb_low.json
└── environment-cpu.yml        # Conda environment

Dependencies

# Main
- python=3.11
- fastapi=0.115
- uvicorn
- xgboost
- scikit-learn
- pandas
- numpy
- loguru
- aiohttp
- requests

Configuration

Environment Variables

No environment variables are required.

Optional:

ML_MODEL_PATH=./trained_models
BINANCE_API_KEY=xxx  # Optional; used for rate limits
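
Since both variables are optional, the service needs fallbacks when they are unset. A minimal sketch of how they might be read follows; `load_settings` is a hypothetical helper, and using `./trained_models` as the default path is an assumption based on the example value above.

```python
import os
from pathlib import Path


def load_settings(env=None) -> dict:
    """Read the optional settings, falling back to assumed defaults."""
    env = os.environ if env is None else env
    return {
        "model_path": Path(env.get("ML_MODEL_PATH", "./trained_models")),
        # An empty or missing key means unauthenticated Binance requests.
        "binance_api_key": env.get("BINANCE_API_KEY") or None,
    }


settings = load_settings({"ML_MODEL_PATH": "/tmp/models"})
defaults = load_settings({})
print(settings, defaults)
```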

Start the Server

cd apps/ml-services
conda activate trading-ml
uvicorn src.api.server:app --host 0.0.0.0 --port 8000 --reload

Limitations

  1. Supported symbols: only BTCUSDT and ETHUSDT for training
  2. Maximum horizon: 6 hours (72 candles of 5 min)
  3. Binance rate limits: 1200 requests/min
  4. Accuracy: typical MAE of 0.1% to 2%

Planned Improvements

  • GRU model for sequential patterns
  • XGBoost + GRU ensemble
  • Support for more symbols (XAU, EUR)
  • Tick-level predictions
  • AutoML for hyperparameter optimization

Documentation generated: 2025-12-05