id	title	type	project	version	updated_date
README	ML Signals and Predictions	Documentation	trading-platform	1.0.0	2026-02-06

OQI-006: ML Signals and Predictions

Status: ✅ Implemented
Date: 2025-12-05
Module: apps/ml-services

Description

XGBoost-based price prediction system. For each time horizon it predicts:

Expected maximum price over the horizon
Expected minimum price over the horizon
Confidence level of the prediction

Architecture

┌─────────────────┐     ┌─────────────────────────────────────────┐
│   Binance API   │────▶│          ML SERVICES (FastAPI)          │
│   (Market Data) │     │                Port 8000                │
└─────────────────┘     │                                         │
                        │  ┌──────────────┐  ┌──────────────┐     │
                        │  │ MarketData   │  │  XGBoost     │     │
                        │  │   Fetcher    │──│  Predictor   │     │
                        │  └──────────────┘  └──────────────┘     │
                        │         │                 │             │
                        │         ▼                 ▼             │
                        │  ┌──────────────┐  ┌──────────────┐     │
                        │  │   Feature    │  │   Training   │     │
                        │  │ Engineering  │  │   Pipeline   │     │
                        │  └──────────────┘  └──────────────┘     │
                        └─────────────────────────────────────────┘

Endpoints

Method	Path	Description
GET	`/health`	Health check
GET	`/api/stats`	Service status
GET	`/api/predict/{symbol}`	Price predictions
POST	`/api/train/{symbol}`	Train the model
GET	`/api/training/status`	Training status
GET	`/api/signals/{symbol}`	Trading signals
GET	`/api/indicators/{symbol}`	Technical indicators
WS	`/ws/{symbol}`	Real-time predictions
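
The path templates above can be turned into concrete URLs with the standard library alone. This is a minimal sketch: `BASE_URL` and the helper name `endpoint_url` are assumptions made for illustration, not part of the service.

```python
from urllib.parse import quote, urlencode

# Assumption: the service runs locally on port 8000, as in the examples of this README.
BASE_URL = "http://localhost:8000"


def endpoint_url(path_template: str, symbol: str = "", **query: str) -> str:
    """Fill the {symbol} placeholder and append an optional query string."""
    path = path_template.replace("{symbol}", quote(symbol, safe=""))
    url = BASE_URL + path
    if query:
        url += "?" + urlencode(query)
    return url


predict_url = endpoint_url("/api/predict/{symbol}", "BTCUSDT", horizon="all")
health_url = endpoint_url("/health")
print(predict_url)
print(health_url)
```

With the server running, the resulting string can be passed to `urllib.request.urlopen` or `requests.get`.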

XGBoost Model

Configuration

from dataclasses import dataclass


@dataclass
class ModelConfig:
    n_estimators: int = 100        # Number of trees
    max_depth: int = 6             # Maximum tree depth
    learning_rate: float = 0.1     # Learning rate
    subsample: float = 0.8         # Row subsample ratio per tree
    colsample_bytree: float = 0.8  # Column subsample ratio per tree
    min_child_weight: int = 1      # Minimum sum of instance weights in a child
    random_state: int = 42         # Seed for reproducibility
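
Every field of `ModelConfig` shares its name with a keyword argument of `xgboost.XGBRegressor`, so the config can be converted to a plain dict and unpacked into the estimator. The sketch below shows only the conversion. Whether `XGBoostPredictor` builds its models this way is an assumption.

```python
from dataclasses import asdict, dataclass


@dataclass
class ModelConfig:
    n_estimators: int = 100
    max_depth: int = 6
    learning_rate: float = 0.1
    subsample: float = 0.8
    colsample_bytree: float = 0.8
    min_child_weight: int = 1
    random_state: int = 42


# Override one hyperparameter and keep the remaining defaults.
config = ModelConfig(max_depth=4)
params = asdict(config)
print(params)

# With xgboost installed, the dict could be unpacked into the estimator:
#   model = xgboost.XGBRegressor(**params)
```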

Features (30+)

Volatility:

volatility_5, volatility_10, volatility_20, volatility_50
atr_5, atr_10, atr_20, atr_50

Momentum:

momentum_5, momentum_10, momentum_20
roc_5, roc_10, roc_20

Moving averages:

sma_5, sma_10, sma_20, sma_50
ema_5, ema_10, ema_20, ema_50
sma_ratio_5, sma_ratio_10, sma_ratio_20, sma_ratio_50

Indicators:

rsi_14 - Relative Strength Index
macd, macd_signal, macd_histogram
bb_position - Position within the Bollinger Bands

Volume:

volume_ratio - Volume relative to its 20-period SMA

High/Low:

hl_range_pct - High-low range as a percentage
high_distance, low_distance
hist_max_ratio_*, hist_min_ratio_*
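
The README names the features but does not give their formulas. The sketch below uses common textbook definitions for a few of them (`sma_N`, `momentum_N`, `roc_N`, `sma_ratio_N`). These definitions are assumptions and may differ from what `indicators.py` actually computes.

```python
from statistics import fmean


def sma(values: list[float], window: int) -> float:
    """Simple moving average of the last `window` values."""
    return fmean(values[-window:])


def momentum(values: list[float], window: int) -> float:
    """Absolute price change over the last `window` candles (assumed definition)."""
    return values[-1] - values[-1 - window]


def roc(values: list[float], window: int) -> float:
    """Rate of change: fractional price change over `window` candles (assumed definition)."""
    return values[-1] / values[-1 - window] - 1.0


def sma_ratio(values: list[float], window: int) -> float:
    """Latest close divided by its moving average (assumed definition)."""
    return values[-1] / sma(values, window)


# Made-up closing prices, oldest first.
closes = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0]
features = {
    "sma_5": sma(closes, 5),
    "momentum_5": momentum(closes, 5),
    "roc_5": roc(closes, 5),
    "sma_ratio_5": sma_ratio(closes, 5),
}
print(features)
```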

Targets

The model predicts two targets:

max_ratio: Relative distance from the current price up to the future high
```
max_ratio = future_high / current_price - 1
```
min_ratio: Relative distance from the current price down to the future low
```
min_ratio = 1 - future_low / current_price
```
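
The two formulas can be applied to a candle series as in the sketch below. It assumes that `current_price` is the close of the current candle and that the future window is the next `horizon` candles, excluding the current one. Neither detail is stated in this README.

```python
def compute_targets(close, high, low, horizon):
    """Return (max_ratios, min_ratios) for every candle that has a full look-ahead window."""
    max_ratios, min_ratios = [], []
    for i in range(len(close) - horizon):
        # Assumption: look ahead over the next `horizon` candles, excluding candle i.
        future_high = max(high[i + 1 : i + 1 + horizon])
        future_low = min(low[i + 1 : i + 1 + horizon])
        max_ratios.append(future_high / close[i] - 1)
        min_ratios.append(1 - future_low / close[i])
    return max_ratios, min_ratios


# Made-up candles, oldest first.
close = [100.0, 102.0, 101.0, 104.0]
high = [101.0, 103.0, 102.0, 105.0]
low = [99.0, 100.0, 100.0, 102.0]

max_ratios, min_ratios = compute_targets(close, high, low, horizon=2)
print(max_ratios, min_ratios)
```

The last `horizon` candles produce no target because their look-ahead window is incomplete.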

Prediction Horizons

Horizon	Candles (5 min)	Time	Use
Scalping	6	30 min	Fast trading
Intraday	18	90 min	Day trading
Swing	36	3 hours	Swing trading
Position	72	6 hours	Longer-held positions
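
The Time column is the candle count multiplied by the 5-minute candle size. A small lookup makes that relationship explicit. The constant and function names are illustrative assumptions.

```python
CANDLE_MINUTES = 5  # candle size used in the table above

# Horizon name -> number of 5-minute candles, copied from the table.
HORIZON_CANDLES = {"scalping": 6, "intraday": 18, "swing": 36, "position": 72}


def horizon_minutes(name: str) -> int:
    """Length of a prediction horizon in minutes."""
    return HORIZON_CANDLES[name] * CANDLE_MINUTES


minutes = {name: horizon_minutes(name) for name in HORIZON_CANDLES}
print(minutes)
```

These values match the `minutes` field of each horizon in the prediction response.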

Training Metrics

Metric	Description	Typical value
high_mae	Mean absolute error (high)	0.1% - 2%
high_rmse	Root mean squared error (high)	0.15% - 2.5%
low_mae	Mean absolute error (low)	0.1% - 2%
low_rmse	Root mean squared error (low)	0.15% - 2.5%
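
Both metrics are computed on the ratio targets, so a value such as 0.00099 reads as roughly 0.1% of the price. The sketch below shows the standard definitions on made-up numbers. It is not the project's evaluation code.

```python
import math


def mae(y_true: list[float], y_pred: list[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: list[float], y_pred: list[float]) -> float:
    """Root mean squared error; penalizes large misses more than MAE does."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Made-up ratio targets and predictions.
y_true = [0.010, 0.020, 0.015]
y_pred = [0.012, 0.017, 0.015]

high_mae = mae(y_true, y_pred)
high_rmse = rmse(y_true, y_pred)
print(f"high_mae={high_mae:.5f} high_rmse={high_rmse:.5f}")
```

RMSE is never smaller than MAE on the same data, which is consistent with the typical ranges in the table.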

Prediction Example

Request

curl "http://localhost:8000/api/predict/BTCUSDT?horizon=all"

Response

{
  "symbol": "BTCUSDT",
  "timestamp": "2025-12-05T18:05:08.889327Z",
  "current_price": 89388.99,
  "predictions": {
    "scalping": {
      "high": 89663.86,
      "low": 88930.53,
      "high_ratio": 1.0031,
      "low_ratio": 0.9949,
      "confidence": 0.69,
      "minutes": 30
    },
    "intraday": {
      "high": 90213.60,
      "low": 88013.61,
      "high_ratio": 1.0093,
      "low_ratio": 0.9848,
      "confidence": 0.59,
      "minutes": 90
    },
    "swing": {
      "high": 91038.21,
      "low": 86638.23,
      "high_ratio": 1.0187,
      "low_ratio": 0.9698,
      "confidence": 0.45,
      "minutes": 180
    },
    "position": {
      "high": 92687.43,
      "low": 83887.47,
      "high_ratio": 1.0378,
      "low_ratio": 0.9405,
      "confidence": 0.45,
      "minutes": 360
    }
  },
  "model_version": "1.0.0",
  "is_trained": true
}
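
In this sample, `high_ratio` and `low_ratio` look like the predicted price divided by `current_price`, not the offset-style training targets. That reading is inferred from the numbers and is not confirmed elsewhere in this README. The sketch below checks it against the scalping entry. The other horizons in the sample agree only approximately in the last digit.

```python
import json

# Scalping entry copied from the sample response above.
sample = """
{
  "symbol": "BTCUSDT",
  "current_price": 89388.99,
  "predictions": {
    "scalping": {
      "high": 89663.86,
      "low": 88930.53,
      "high_ratio": 1.0031,
      "low_ratio": 0.9949,
      "confidence": 0.69,
      "minutes": 30
    }
  }
}
"""

payload = json.loads(sample)
price = payload["current_price"]
scalping = payload["predictions"]["scalping"]

# Assumed relationship: ratio = predicted price / current price.
implied_high_ratio = scalping["high"] / price
implied_low_ratio = scalping["low"] / price
print(f"{implied_high_ratio:.4f} {implied_low_ratio:.4f}")
```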

Training

Start Training

curl -X POST "http://localhost:8000/api/train/BTCUSDT?samples=500"

Response

{
  "status": "training_started",
  "symbol": "BTCUSDT",
  "samples": 500,
  "message": "Model training started in background. Check /api/stats for progress."
}

Check Status

curl http://localhost:8000/api/training/status

{
  "training_in_progress": false,
  "is_trained": true,
  "last_training": {
    "symbol": "BTCUSDT",
    "timestamp": "2025-12-05T18:04:49.757994",
    "samples": 500,
    "metrics": {
      "high_mae": 0.00099,
      "high_rmse": 0.00141,
      "low_mae": 0.00173,
      "low_rmse": 0.00284,
      "train_samples": 355,
      "test_samples": 89
    }
  }
}
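
Because training runs in the background, a client typically polls the status endpoint until `training_in_progress` becomes false. The sketch below takes the status fetcher as a parameter so it runs without a server. The function name, polling interval, and retry budget are assumptions.

```python
import time
from typing import Callable


def wait_for_training(
    fetch_status: Callable[[], dict],
    poll_seconds: float = 2.0,
    max_polls: int = 30,
) -> dict:
    """Poll until training finishes and return the final status payload."""
    for _ in range(max_polls):
        status = fetch_status()
        if not status["training_in_progress"]:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("training did not finish within the polling budget")


# Stand-in for GET /api/training/status: in progress twice, then finished.
_responses = iter([
    {"training_in_progress": True, "is_trained": False},
    {"training_in_progress": True, "is_trained": False},
    {"training_in_progress": False, "is_trained": True},
])

final_status = wait_for_training(lambda: next(_responses), poll_seconds=0.0)
print(final_status)
```

In real use, `fetch_status` would wrap an HTTP GET to `/api/training/status` and return the decoded JSON.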

Market Data

Data Sources

Source	Use	API
Binance	Crypto (BTC, ETH)	REST + WebSocket
Mock Data	Testing	Generated locally

OHLCV Structure

from dataclasses import dataclass

import numpy as np


@dataclass
class OHLCV:
    timestamp: np.ndarray  # Epoch milliseconds
    open: np.ndarray       # Open price
    high: np.ndarray       # High price
    low: np.ndarray        # Low price
    close: np.ndarray      # Close price
    volume: np.ndarray     # Volume
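
Binance's REST klines endpoint returns one row per candle, with the open time in epoch milliseconds first, followed by open, high, low, close, and volume as strings. The sketch below transposes such rows into one column per field. It uses plain lists to stay dependency-free, and each list could be wrapped with `np.asarray` to populate `OHLCV`. The helper name and sample values are made up, and `MarketDataFetcher` may do this differently.

```python
FIELDS = ("open", "high", "low", "close", "volume")


def klines_to_columns(klines: list[list]) -> dict[str, list]:
    """Transpose kline rows into column lists keyed by OHLCV field name."""
    columns: dict[str, list] = {"timestamp": []}
    columns.update({name: [] for name in FIELDS})
    for row in klines:
        columns["timestamp"].append(int(row[0]))  # open time, epoch milliseconds
        for offset, name in enumerate(FIELDS, start=1):
            columns[name].append(float(row[offset]))  # prices and volume arrive as strings
    return columns


# Two made-up 5-minute candles in the kline row layout (extra trailing fields omitted).
rows = [
    [1764957600000, "89300.00", "89450.00", "89250.00", "89388.99", "12.5"],
    [1764957900000, "89388.99", "89500.00", "89300.00", "89420.10", "9.8"],
]

columns = klines_to_columns(rows)
print(columns["timestamp"], columns["close"])
```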

Files

apps/ml-services/
├── src/
│   ├── api/
│   │   ├── server.py          # FastAPI app
│   │   └── schemas/
│   │       └── prediction.py  # Pydantic schemas
│   ├── data/
│   │   └── market_data.py     # MarketDataFetcher
│   └── models/
│       ├── xgboost_model.py   # XGBoostPredictor
│       ├── predictor.py       # MaxMinPricePredictor
│       └── indicators.py      # Technical indicators
├── trained_models/            # Saved models
│   ├── xgb_high.json
│   └── xgb_low.json
└── environment-cpu.yml        # Conda environment

Dependencies

# Main
- python=3.11
- fastapi=0.115
- uvicorn
- xgboost
- scikit-learn
- pandas
- numpy
- loguru
- aiohttp
- requests

Configuration

Environment Variables

No environment variables are required.

Optional:

ML_MODEL_PATH=./trained_models
BINANCE_API_KEY=xxx  # Optional, for rate limits
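
Since both variables are optional, the service has to fall back to defaults when they are unset. The sketch below shows one way to read them. The `load_settings` helper is hypothetical, and using `./trained_models` as the default is an assumption based on the example value above.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the optional variables, falling back to defaults when unset."""
    env = os.environ if env is None else env
    return {
        # Assumed default, taken from the example value in this README.
        "model_path": Path(env.get("ML_MODEL_PATH", "./trained_models")),
        # An empty or missing key means unauthenticated Binance access.
        "binance_api_key": env.get("BINANCE_API_KEY") or None,
    }


defaults = load_settings({})
custom = load_settings({"ML_MODEL_PATH": "/srv/models", "BINANCE_API_KEY": "xxx"})
print(defaults)
print(custom)
```

Passing a mapping instead of reading `os.environ` directly keeps the function easy to test.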

Start the Server

cd apps/ml-services
conda activate trading-ml
uvicorn src.api.server:app --host 0.0.0.0 --port 8000 --reload

Limitations

Supported symbols: only BTCUSDT and ETHUSDT for training
Maximum horizon: 6 hours (72 candles of 5 min)
Binance rate limits: 1200 requests/min
Accuracy: typical MAE of 0.1% to 2%

Planned Improvements

GRU model for sequential patterns
XGBoost + GRU ensemble
Support for more symbols (XAU, EUR)
Tick-level predictions
AutoML for hyperparameter optimization

Assigned DDL Schemas

This module owns the following DDL schema:

Schema	Tables	Description
ml	12	models, model_versions, predictions, signals, signal_subscriptions, backtests, backtest_results, feature_sets, training_jobs, ensemble_models, ensemble_predictions, model_metrics

Total tables: 12

DDL drift note: Earlier documentation had no DDL schemas section. The 12 tables cover the full ML lifecycle: training (models, model_versions, training_jobs, feature_sets), prediction (predictions, signals, signal_subscriptions), evaluation (backtests, backtest_results, model_metrics), and ensembles (ensemble_models, ensemble_predictions). Updated by TASK-2026-02-06 F2.6.

Documentation generated: 2025-12-05