id: README
title: ML Signals and Predictions
type: Documentation
project: trading-platform
version: 1.0.0
updated_date: 2026-02-06

OQI-006: ML Signals and Predictions

Status: Implemented
Date: 2025-12-05
Module: apps/ml-services


Description

XGBoost-based price prediction system. For each prediction horizon it outputs:

  • The expected maximum price
  • The expected minimum price
  • A confidence level for the prediction

Architecture

┌─────────────────┐     ┌─────────────────────────────────────────┐
│   Binance API   │────▶│          ML SERVICES (FastAPI)          │
│   (Market Data) │     │                 Port 8000                │
└─────────────────┘     │                                          │
                        │  ┌──────────────┐  ┌──────────────┐     │
                        │  │ MarketData   │  │  XGBoost     │     │
                        │  │   Fetcher    │──│  Predictor   │     │
                        │  └──────────────┘  └──────────────┘     │
                        │         │                 │              │
                        │         ▼                 ▼              │
                        │  ┌──────────────┐  ┌──────────────┐     │
                        │  │   Feature    │  │   Training   │     │
                        │  │ Engineering  │  │   Pipeline   │     │
                        │  └──────────────┘  └──────────────┘     │
                        └─────────────────────────────────────────┘

Endpoints

Method  Path                      Description
GET     /health                   Health check
GET     /api/stats                Service status
GET     /api/predict/{symbol}     Price predictions
POST    /api/train/{symbol}       Train the model
GET     /api/training/status      Training status
GET     /api/signals/{symbol}     Trading signals
GET     /api/indicators/{symbol}  Technical indicators
WS      /ws/{symbol}              Real-time predictions
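
As a minimal sketch of how a client might address these routes, the helpers below build request URLs for the prediction and training endpoints. The `horizon` and `samples` query parameters are taken from the curl examples later in this document; the helper names and the `BASE_URL` constant are hypothetical and not part of the service.

```python
from urllib.parse import quote, urlencode

# Host and port as used in this document's curl examples.
BASE_URL = "http://localhost:8000"


def predict_url(symbol: str, horizon: str = "all", base_url: str = BASE_URL) -> str:
    """Build the URL for GET /api/predict/{symbol}."""
    query = urlencode({"horizon": horizon})
    return f"{base_url}/api/predict/{quote(symbol, safe='')}?{query}"


def train_url(symbol: str, samples: int = 500, base_url: str = BASE_URL) -> str:
    """Build the URL for POST /api/train/{symbol}."""
    query = urlencode({"samples": samples})
    return f"{base_url}/api/train/{quote(symbol, safe='')}?{query}"
```

Escaping the symbol with `quote` keeps the path valid even if a symbol ever contains a slash or other reserved character.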

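XGBoost Model

Configuration

from dataclasses import dataclass

@dataclass
class ModelConfig:
    n_estimators: int = 100        # Number of trees
    max_depth: int = 6             # Maximum tree depth
    learning_rate: float = 0.1     # Learning rate (shrinkage)
    subsample: float = 0.8         # Fraction of rows sampled per tree
    colsample_bytree: float = 0.8  # Fraction of features sampled per tree
    min_child_weight: int = 1      # Minimum sum of instance weights in a child
    random_state: int = 42         # Seed for reproducibility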
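
One way such a configuration could be handed to XGBoost is as keyword arguments, since every field name matches a parameter of the scikit-learn style `XGBRegressor`. The sketch below only builds the parameter dictionary; the `to_xgb_params` helper is hypothetical, and how the module actually constructs its models is not shown in the source.

```python
from dataclasses import asdict, dataclass


@dataclass
class ModelConfig:
    n_estimators: int = 100
    max_depth: int = 6
    learning_rate: float = 0.1
    subsample: float = 0.8
    colsample_bytree: float = 0.8
    min_child_weight: int = 1
    random_state: int = 42


def to_xgb_params(config: ModelConfig) -> dict:
    """Convert the dataclass into a plain dict of keyword arguments."""
    return asdict(config)


# Override one field and keep the documented defaults for the rest.
params = to_xgb_params(ModelConfig(max_depth=4))

# With xgboost installed, the dict could be used as (assumption, not from the source):
#   model = xgboost.XGBRegressor(**params)
```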

Features (30+)

Volatility:

  • volatility_5, volatility_10, volatility_20, volatility_50
  • atr_5, atr_10, atr_20, atr_50

Momentum:

  • momentum_5, momentum_10, momentum_20
  • roc_5, roc_10, roc_20

Moving Averages:

  • sma_5, sma_10, sma_20, sma_50
  • ema_5, ema_10, ema_20, ema_50
  • sma_ratio_5, sma_ratio_10, sma_ratio_20, sma_ratio_50

Indicators:

  • rsi_14 - Relative Strength Index
  • macd, macd_signal, macd_histogram
  • bb_position - Position within the Bollinger Bands

Volume:

  • volume_ratio - Volume relative to its 20-period SMA

High/Low:

  • hl_range_pct - High-low range as a percentage
  • high_distance, low_distance
  • hist_max_ratio_*, hist_min_ratio_*
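
The feature list above gives names but not formulas. As a rough illustration, the sketch below computes a few of them using common textbook definitions. These definitions are assumptions and may differ from what `indicators.py` actually does; the suffix in names such as `sma_5` or `roc_10` is read here as the lookback window.

```python
from statistics import fmean
from typing import Sequence


def sma(values: Sequence[float], window: int) -> float:
    """Simple moving average over the last `window` values."""
    if len(values) < window:
        raise ValueError("not enough values for the requested window")
    return fmean(values[-window:])


def sma_ratio(closes: Sequence[float], window: int) -> float:
    """Latest close divided by its simple moving average (assumed definition)."""
    return closes[-1] / sma(closes, window)


def roc(closes: Sequence[float], window: int) -> float:
    """Rate of change over `window` periods, as a fraction (assumed definition)."""
    if len(closes) <= window:
        raise ValueError("not enough values for the requested window")
    return closes[-1] / closes[-1 - window] - 1.0


def hl_range_pct(high: float, low: float, close: float) -> float:
    """High-low range of a candle as a percentage of its close (assumed definition)."""
    return (high - low) / close * 100.0
```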

Targets

The model predicts two targets, both expressed as fractions of the current price:

  1. max_ratio: How far the future high lies above the current price

    max_ratio = future_high / current_price - 1
    
  2. min_ratio: How far the future low lies below the current price

    min_ratio = 1 - future_low / current_price
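
The two formulas can be sketched directly in code. This sketch assumes that `future_high` and `future_low` are the highest high and lowest low over the candles in the horizon window, which the source does not state explicitly. It also includes a conversion to plain price ratios: the example response in this document reports `high_ratio` and `low_ratio` values such as 1.0031 and 0.9949, which would correspond to `1 + max_ratio` and `1 - min_ratio`. That correspondence is inferred from the numbers, not confirmed by the source.

```python
from typing import Sequence, Tuple


def compute_targets(
    current_price: float,
    future_highs: Sequence[float],
    future_lows: Sequence[float],
) -> Tuple[float, float]:
    """Return (max_ratio, min_ratio) for one sample."""
    future_high = max(future_highs)  # assumption: highest high in the window
    future_low = min(future_lows)    # assumption: lowest low in the window
    max_ratio = future_high / current_price - 1
    min_ratio = 1 - future_low / current_price
    return max_ratio, min_ratio


def to_price_ratios(max_ratio: float, min_ratio: float) -> Tuple[float, float]:
    """Convert targets to future/current price ratios (inferred mapping)."""
    return 1 + max_ratio, 1 - min_ratio
```

Because both targets are measured as distances from the current price, they are non-negative whenever the window's high is at or above the current price and its low is at or below it.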
    

Prediction Horizons

Horizon   Candles (5 min)  Duration  Use
Scalping  6                30 min    Fast trading
Intraday  18               90 min    Day trading
Swing     36               3 hours   Swing trading
Position  72               6 hours   Longer-held positions
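
The table reduces to a lookup from horizon name to candle count, with the duration derived from the 5-minute candle size. A minimal sketch follows; the constant and function names are hypothetical, while the lowercase keys mirror the keys used in the example prediction response.

```python
CANDLE_MINUTES = 5  # candle size stated in the table header

# Candle counts per horizon, copied from the table above.
HORIZON_CANDLES = {
    "scalping": 6,
    "intraday": 18,
    "swing": 36,
    "position": 72,
}


def horizon_minutes(name: str) -> int:
    """Return the horizon length in minutes."""
    return HORIZON_CANDLES[name] * CANDLE_MINUTES
```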

Training Metrics

Metric     Description                     Typical value
high_mae   Mean absolute error (high)      0.1% - 2%
high_rmse  Root mean squared error (high)  0.15% - 2.5%
low_mae    Mean absolute error (low)       0.1% - 2%
low_rmse   Root mean squared error (low)   0.15% - 2.5%
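
For reference, both metrics can be computed with a few lines of plain Python. This is a generic sketch of the standard definitions, not the module's own evaluation code. The inputs are assumed to be the ratio targets, so a result of 0.001 corresponds to 0.1%.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))
```

RMSE is never smaller than MAE on the same data, which matches the slightly higher typical range listed for the RMSE rows.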

Prediction Example

Request

curl "http://localhost:8000/api/predict/BTCUSDT?horizon=all"

Response

{
  "symbol": "BTCUSDT",
  "timestamp": "2025-12-05T18:05:08.889327Z",
  "current_price": 89388.99,
  "predictions": {
    "scalping": {
      "high": 89663.86,
      "low": 88930.53,
      "high_ratio": 1.0031,
      "low_ratio": 0.9949,
      "confidence": 0.69,
      "minutes": 30
    },
    "intraday": {
      "high": 90213.60,
      "low": 88013.61,
      "high_ratio": 1.0093,
      "low_ratio": 0.9848,
      "confidence": 0.59,
      "minutes": 90
    },
    "swing": {
      "high": 91038.21,
      "low": 86638.23,
      "high_ratio": 1.0187,
      "low_ratio": 0.9698,
      "confidence": 0.45,
      "minutes": 180
    },
    "position": {
      "high": 92687.43,
      "low": 83887.47,
      "high_ratio": 1.0378,
      "low_ratio": 0.9405,
      "confidence": 0.45,
      "minutes": 360
    }
  },
  "model_version": "1.0.0",
  "is_trained": true
}
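
A consumer of this response might reduce each horizon to the width of its predicted range and its confidence. The sketch below does that on a trimmed copy of the example response above (two of the four horizons, ratio fields omitted); the `summarize` helper and the `range_pct` field are hypothetical.

```python
import json

# Trimmed copy of the example response above.
SAMPLE_RESPONSE = """
{
  "symbol": "BTCUSDT",
  "current_price": 89388.99,
  "predictions": {
    "scalping": {"high": 89663.86, "low": 88930.53, "confidence": 0.69, "minutes": 30},
    "position": {"high": 92687.43, "low": 83887.47, "confidence": 0.45, "minutes": 360}
  }
}
"""


def summarize(response: dict) -> dict:
    """Map each horizon to its predicted range width (% of price) and confidence."""
    current = response["current_price"]
    summary = {}
    for name, pred in response["predictions"].items():
        summary[name] = {
            "range_pct": (pred["high"] - pred["low"]) / current * 100.0,
            "confidence": pred["confidence"],
        }
    return summary


summary = summarize(json.loads(SAMPLE_RESPONSE))
```

In the example data the predicted range widens and the confidence drops as the horizon grows.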

Training

Start Training

curl -X POST "http://localhost:8000/api/train/BTCUSDT?samples=500"

Response

{
  "status": "training_started",
  "symbol": "BTCUSDT",
  "samples": 500,
  "message": "Model training started in background. Check /api/stats for progress."
}

Check Status

curl http://localhost:8000/api/training/status

Response

{
  "training_in_progress": false,
  "is_trained": true,
  "last_training": {
    "symbol": "BTCUSDT",
    "timestamp": "2025-12-05T18:04:49.757994",
    "samples": 500,
    "metrics": {
      "high_mae": 0.00099,
      "high_rmse": 0.00141,
      "low_mae": 0.00173,
      "low_rmse": 0.00284,
      "train_samples": 355,
      "test_samples": 89
    }
  }
}
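
Since training runs in the background, a client would typically poll the status endpoint until `training_in_progress` becomes false. The following is a minimal sketch of such a loop using only the standard library; the function names, polling interval, and retry limit are assumptions, and the fetch function is injectable so the loop can be exercised without a running server.

```python
import json
import time
from typing import Callable
from urllib.request import urlopen


def fetch_status_http(base_url: str = "http://localhost:8000") -> dict:
    """GET /api/training/status and decode the JSON body."""
    with urlopen(f"{base_url}/api/training/status") as resp:
        return json.load(resp)


def wait_for_training(
    fetch_status: Callable[[], dict] = fetch_status_http,
    poll_seconds: float = 5.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until training finishes, then return the final status payload."""
    for _ in range(max_polls):
        status = fetch_status()
        if not status["training_in_progress"]:
            return status
        sleep(poll_seconds)
    raise TimeoutError("training did not finish within the polling budget")
```

After it returns, `status["last_training"]["metrics"]` holds the MAE and RMSE values shown in the response above.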

Market Data

Data Sources

Source     Use                API
Binance    Crypto (BTC, ETH)  REST + WebSocket
Mock Data  Testing            Generated locally

OHLCV Structure

from dataclasses import dataclass

import numpy as np

@dataclass
class OHLCV:
    timestamp: np.ndarray  # Epoch milliseconds
    open: np.ndarray       # Open price
    high: np.ndarray       # High price
    low: np.ndarray        # Low price
    close: np.ndarray      # Close price
    volume: np.ndarray     # Volume
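
To illustrate how raw exchange data could be turned into this column layout, the sketch below converts Binance REST kline rows into per-field columns. Binance returns each kline as an array whose first six entries are open time (ms), open, high, low, close, and volume, with prices and volume encoded as strings. To stay dependency-free this sketch uses plain Python lists rather than `np.ndarray`; the `OHLCVLists` class and `klines_to_ohlcv` function are hypothetical and not the module's `MarketDataFetcher`.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class OHLCVLists:
    """List-based stand-in for the OHLCV structure (illustration only)."""
    timestamp: List[int] = field(default_factory=list)  # Epoch milliseconds
    open: List[float] = field(default_factory=list)
    high: List[float] = field(default_factory=list)
    low: List[float] = field(default_factory=list)
    close: List[float] = field(default_factory=list)
    volume: List[float] = field(default_factory=list)


def klines_to_ohlcv(klines: Sequence[Sequence]) -> OHLCVLists:
    """Split kline rows into columns, converting string fields to floats."""
    out = OHLCVLists()
    for row in klines:
        out.timestamp.append(int(row[0]))
        out.open.append(float(row[1]))
        out.high.append(float(row[2]))
        out.low.append(float(row[3]))
        out.close.append(float(row[4]))
        out.volume.append(float(row[5]))
    return out
```

Wrapping each list with `np.asarray` would give arrays matching the field types of the structure above.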

Files

apps/ml-services/
├── src/
│   ├── api/
│   │   ├── server.py          # FastAPI app
│   │   └── schemas/
│   │       └── prediction.py  # Pydantic schemas
│   ├── data/
│   │   └── market_data.py     # MarketDataFetcher
│   └── models/
│       ├── xgboost_model.py   # XGBoostPredictor
│       ├── predictor.py       # MaxMinPricePredictor
│       └── indicators.py      # Technical indicators
├── trained_models/            # Saved models
│   ├── xgb_high.json
│   └── xgb_low.json
└── environment-cpu.yml        # Conda environment

Dependencies

# Main
- python=3.11
- fastapi=0.115
- uvicorn
- xgboost
- scikit-learn
- pandas
- numpy
- loguru
- aiohttp
- requests

Configuration

Environment Variables

No environment variables are required.

Optional:

ML_MODEL_PATH=./trained_models
BINANCE_API_KEY=xxx  # Optional, for rate limits
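
Because both variables are optional, reading them needs a fallback for each. A minimal sketch follows, assuming `./trained_models` is the default model directory (it matches the sample value and the directory tree, but the source does not state the default); the `load_settings` helper is hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the optional settings, falling back when a variable is unset."""
    env = os.environ if env is None else env
    return {
        # Assumed default, taken from the sample value in this document.
        "model_path": Path(env.get("ML_MODEL_PATH", "./trained_models")),
        # Treat an unset or empty key as "no key configured".
        "binance_api_key": env.get("BINANCE_API_KEY") or None,
    }
```

Passing a mapping instead of relying on `os.environ` makes the function easy to call from tests.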

Start the Server

cd apps/ml-services
conda activate trading-ml
uvicorn src.api.server:app --host 0.0.0.0 --port 8000 --reload

Limitations

  1. Supported symbols: Only BTCUSDT and ETHUSDT for training
  2. Maximum horizon: 6 hours (72 five-minute candles)
  3. Binance rate limits: 1200 requests/min
  4. Accuracy: Typical MAE of 0.1% to 2%

Upcoming Improvements

  • GRU model for sequential patterns
  • XGBoost + GRU ensemble
  • Support for more symbols (XAU, EUR)
  • Tick-level predictions
  • AutoML for hyperparameter optimization

Assigned DDL Schemas

This module owns the following DDL schema:

Schema  Tables  Description
ml      12      models, model_versions, predictions, signals, signal_subscriptions, backtests, backtest_results, feature_sets, training_jobs, ensemble_models, ensemble_predictions, model_metrics

Total tables: 12

DDL drift note: Earlier documentation had no DDL schemas section. The 12 tables cover the full ML lifecycle: training (models, model_versions, training_jobs, feature_sets), prediction (predictions, signals, signal_subscriptions), evaluation (backtests, backtest_results, model_metrics), and ensembling (ensemble_models, ensemble_predictions). Updated by TASK-2026-02-06 F2.6.


Documentation generated: 2025-12-05