| id | title | type | project | version | updated_date |
|---|---|---|---|---|---|
| README | ML Signals and Predictions | Documentation | trading-platform | 1.0.0 | 2026-02-06 |
OQI-006: ML Signals and Predictions
Status: ✅ Implemented
Date: 2025-12-05
Module: apps/ml-services
Description
XGBoost-based price prediction system that predicts:
- Expected maximum price over the prediction horizon
- Expected minimum price over the prediction horizon
- Confidence level of the prediction
Architecture
┌─────────────────┐     ┌─────────────────────────────────────────┐
│   Binance API   │────▶│          ML SERVICES (FastAPI)          │
│  (Market Data)  │     │                Port 8000                │
└─────────────────┘     │                                         │
                        │  ┌──────────────┐  ┌──────────────┐     │
                        │  │  MarketData  │  │   XGBoost    │     │
                        │  │   Fetcher    │──│  Predictor   │     │
                        │  └──────────────┘  └──────────────┘     │
                        │         │                 │             │
                        │         ▼                 ▼             │
                        │  ┌──────────────┐  ┌──────────────┐     │
                        │  │   Feature    │  │   Training   │     │
                        │  │ Engineering  │  │   Pipeline   │     │
                        │  └──────────────┘  └──────────────┘     │
                        └─────────────────────────────────────────┘
Endpoints
| Method | Path | Description |
|---|---|---|
| GET | /health | Health check |
| GET | /api/stats | Service status |
| GET | /api/predict/{symbol} | Price predictions |
| POST | /api/train/{symbol} | Train the model |
| GET | /api/training/status | Training status |
| GET | /api/signals/{symbol} | Trading signals |
| GET | /api/indicators/{symbol} | Technical indicators |
| WS | /ws/{symbol} | Real-time predictions |
XGBoost Model
Configuration
from dataclasses import dataclass

@dataclass
class ModelConfig:
    n_estimators: int = 100        # Number of trees
    max_depth: int = 6             # Maximum tree depth
    learning_rate: float = 0.1     # Learning rate
    subsample: float = 0.8         # Fraction of rows sampled per tree
    colsample_bytree: float = 0.8  # Fraction of features sampled per tree
    min_child_weight: int = 1      # Minimum sum of instance weight in a child
    random_state: int = 42         # Random seed
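As a rough sketch of how these settings could be handed to XGBoost: the field names match keyword arguments of `xgboost.XGBRegressor`, so the dataclass can be unpacked directly. The helper names `to_xgb_params` and `build_regressors` are hypothetical, and training one regressor per target is an assumption suggested by the `xgb_high.json` and `xgb_low.json` files, not something this README states.

```python
from dataclasses import asdict, dataclass


@dataclass
class ModelConfig:
    n_estimators: int = 100
    max_depth: int = 6
    learning_rate: float = 0.1
    subsample: float = 0.8
    colsample_bytree: float = 0.8
    min_child_weight: int = 1
    random_state: int = 42


def to_xgb_params(config: ModelConfig) -> dict:
    """Return the config as keyword arguments (hypothetical helper)."""
    return asdict(config)


def build_regressors(config: ModelConfig):
    """Assumption: one regressor for the high target, one for the low target."""
    # Imported lazily so to_xgb_params() works without xgboost installed.
    import xgboost as xgb

    params = to_xgb_params(config)
    return xgb.XGBRegressor(**params), xgb.XGBRegressor(**params)


# Override a single field and inspect the resulting keyword arguments.
params = to_xgb_params(ModelConfig(max_depth=4))
print(params["max_depth"], params["n_estimators"])
```

Keeping the hyperparameters in a dataclass means a single override, such as `ModelConfig(max_depth=4)`, applies to every model built from it.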
Features (30+)
Volatility:
- volatility_5, volatility_10, volatility_20, volatility_50
- atr_5, atr_10, atr_20, atr_50
Momentum:
- momentum_5, momentum_10, momentum_20
- roc_5, roc_10, roc_20
Moving Averages:
- sma_5, sma_10, sma_20, sma_50
- ema_5, ema_10, ema_20, ema_50
- sma_ratio_5, sma_ratio_10, sma_ratio_20, sma_ratio_50
Indicators:
- rsi_14 - Relative Strength Index
- macd, macd_signal, macd_histogram
- bb_position - Position within the Bollinger Bands
Volume:
- volume_ratio - Ratio vs. the 20-period SMA
High/Low:
- hl_range_pct - High-low range as a percentage
- high_distance, low_distance
- hist_max_ratio_*, hist_min_ratio_*
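The README names these features but does not give their formulas. The sketch below shows how a few of them might be computed from plain lists of prices. The definitions are the common textbook ones and are assumptions here, so the actual code in `indicators.py` may differ.

```python
from typing import Sequence


def sma(values: Sequence[float], window: int) -> float:
    """Simple moving average of the last `window` values."""
    if len(values) < window:
        raise ValueError("not enough values for the requested window")
    return sum(values[-window:]) / window


def sma_ratio(closes: Sequence[float], window: int) -> float:
    """Assumed definition: latest close divided by its SMA."""
    return closes[-1] / sma(closes, window)


def roc(closes: Sequence[float], window: int) -> float:
    """Rate of change: fractional move over the last `window` candles."""
    if len(closes) <= window:
        raise ValueError("not enough values for the requested window")
    return closes[-1] / closes[-1 - window] - 1.0


def hl_range_pct(high: float, low: float, close: float) -> float:
    """Assumed definition: high-low range as a percentage of the close."""
    return (high - low) / close * 100.0


closes = [100.0, 102.0, 104.0, 106.0, 108.0]
print(sma(closes, 5), sma_ratio(closes, 5), roc(closes, 4))
```

Ratio-style features such as `sma_ratio_*` are scale-free, which is one plausible reason to prefer them over raw prices when a model is trained on symbols with very different price levels.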
Targets
The model predicts:
- max_ratio: Ratio of the future maximum to the current price
  max_ratio = future_high / current_price - 1
- min_ratio: Ratio of the future minimum to the current price
  min_ratio = 1 - future_low / current_price
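A minimal sketch of how these two targets could be built from candle data. It assumes `current_price` is the close of the current candle and that the "future" window is the next `horizon` candles. Neither detail is stated in this README, and `compute_targets` is a hypothetical name. Note that the `high_ratio` and `low_ratio` fields in the example response appear to be plain ratios to the current price (for example 89663.86 / 89388.99 ≈ 1.0031), not these offsets.

```python
from typing import List, Sequence, Tuple


def compute_targets(
    highs: Sequence[float],
    lows: Sequence[float],
    closes: Sequence[float],
    horizon: int,
) -> List[Tuple[float, float]]:
    """Return (max_ratio, min_ratio) for every candle with a full future window."""
    targets = []
    for i in range(len(closes) - horizon):
        # Assumption: the window covers the `horizon` candles after candle i.
        future_high = max(highs[i + 1 : i + 1 + horizon])
        future_low = min(lows[i + 1 : i + 1 + horizon])
        current_price = closes[i]
        max_ratio = future_high / current_price - 1
        min_ratio = 1 - future_low / current_price
        targets.append((max_ratio, min_ratio))
    return targets


highs = [101.0, 103.0, 105.0, 104.0]
lows = [99.0, 100.0, 102.0, 101.0]
closes = [100.0, 102.0, 104.0, 103.0]
targets = compute_targets(highs, lows, closes, horizon=2)
print(targets)
```

The last `horizon` candles produce no target because their future window is incomplete, so a training set built this way is always shorter than the raw candle series.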
Prediction Horizons
| Horizon | Candles (5 min) | Time | Use |
|---|---|---|---|
| Scalping | 6 | 30 min | Fast trading |
| Intraday | 18 | 90 min | Day trading |
| Swing | 36 | 3 hours | Swing trading |
| Position | 72 | 6 hours | Longer-term positions |
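The table's arithmetic can be written down as a small lookup. The lowercase keys follow the horizon names used in the API response, while the constant and function names are hypothetical.

```python
CANDLE_MINUTES = 5  # candle size used throughout the table above

# Number of 5-minute candles per horizon, copied from the table.
HORIZON_CANDLES = {
    "scalping": 6,
    "intraday": 18,
    "swing": 36,
    "position": 72,
}


def horizon_minutes(name: str) -> int:
    """Convert a horizon name to its length in minutes."""
    return HORIZON_CANDLES[name] * CANDLE_MINUTES


for name in HORIZON_CANDLES:
    print(name, horizon_minutes(name))
```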
Training Metrics
| Metric | Description | Typical Value |
|---|---|---|
| high_mae | Mean absolute error (high) | 0.1% - 2% |
| high_rmse | Root mean squared error (high) | 0.15% - 2.5% |
| low_mae | Mean absolute error (low) | 0.1% - 2% |
| low_rmse | Root mean squared error (low) | 0.15% - 2.5% |
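For reference, a small sketch of how MAE and RMSE are computed on ratio targets. The sample values are made up for illustration and are not results from this service.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error; penalizes large misses more than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Illustrative ratio targets and predictions (not real model output).
y_true = [0.010, 0.020, 0.030]
y_pred = [0.012, 0.018, 0.034]
print(f"MAE  = {mae(y_true, y_pred):.4%}")
print(f"RMSE = {rmse(y_true, y_pred):.4%}")
```

Because the targets are ratios to the current price, a value such as 0.001 reads directly as 0.1%, which is how the "Typical Value" column is expressed.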
Prediction Example
Request
curl "http://localhost:8000/api/predict/BTCUSDT?horizon=all"
Response
{
  "symbol": "BTCUSDT",
  "timestamp": "2025-12-05T18:05:08.889327Z",
  "current_price": 89388.99,
  "predictions": {
    "scalping": {
      "high": 89663.86,
      "low": 88930.53,
      "high_ratio": 1.0031,
      "low_ratio": 0.9949,
      "confidence": 0.69,
      "minutes": 30
    },
    "intraday": {
      "high": 90213.60,
      "low": 88013.61,
      "high_ratio": 1.0093,
      "low_ratio": 0.9848,
      "confidence": 0.59,
      "minutes": 90
    },
    "swing": {
      "high": 91038.21,
      "low": 86638.23,
      "high_ratio": 1.0187,
      "low_ratio": 0.9698,
      "confidence": 0.45,
      "minutes": 180
    },
    "position": {
      "high": 92687.43,
      "low": 83887.47,
      "high_ratio": 1.0378,
      "low_ratio": 0.9405,
      "confidence": 0.45,
      "minutes": 360
    }
  },
  "model_version": "1.0.0",
  "is_trained": true
}
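One way a client might consume this payload is sketched below. It parses a trimmed copy of the response above and reports the expected upside and downside per horizon, skipping low-confidence entries. The `summarize` helper and the `min_confidence` threshold of 0.5 are hypothetical choices for illustration, not part of the service.

```python
import json

# Trimmed copy of the example response above (two horizons only).
SAMPLE = """
{
  "symbol": "BTCUSDT",
  "current_price": 89388.99,
  "predictions": {
    "scalping": {"high": 89663.86, "low": 88930.53, "confidence": 0.69, "minutes": 30},
    "swing": {"high": 91038.21, "low": 86638.23, "confidence": 0.45, "minutes": 180}
  }
}
"""


def summarize(response: dict, min_confidence: float = 0.5) -> dict:
    """Map horizon name to expected upside/downside in percent."""
    price = response["current_price"]
    summary = {}
    for name, pred in response["predictions"].items():
        if pred["confidence"] < min_confidence:
            continue  # hypothetical filter: ignore low-confidence horizons
        summary[name] = {
            "upside_pct": round((pred["high"] / price - 1) * 100, 2),
            "downside_pct": round((1 - pred["low"] / price) * 100, 2),
        }
    return summary


result = summarize(json.loads(SAMPLE))
print(result)
```

In the sample, confidence drops as the horizon grows, so a threshold like this would mostly keep the shorter horizons.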
Training
Start Training
curl -X POST "http://localhost:8000/api/train/BTCUSDT?samples=500"
Response
{
  "status": "training_started",
  "symbol": "BTCUSDT",
  "samples": 500,
  "message": "Model training started in background. Check /api/stats for progress."
}
Check Status
curl http://localhost:8000/api/training/status
{
  "training_in_progress": false,
  "is_trained": true,
  "last_training": {
    "symbol": "BTCUSDT",
    "timestamp": "2025-12-05T18:04:49.757994",
    "samples": 500,
    "metrics": {
      "high_mae": 0.00099,
      "high_rmse": 0.00141,
      "low_mae": 0.00173,
      "low_rmse": 0.00284,
      "train_samples": 355,
      "test_samples": 89
    }
  }
}
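Since training runs in the background, a client has to poll for completion. The sketch below shows one possible polling loop around `/api/training/status`. The function names, the polling interval, and the timeout are assumptions. The status fetcher is injected so the loop can be exercised without a running server.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_status_http(base_url: str = "http://localhost:8000") -> dict:
    """Fetch the training status from the service (needs a running server)."""
    with urllib.request.urlopen(f"{base_url}/api/training/status") as resp:
        return json.load(resp)


def wait_for_training(
    fetch_status: Callable[[], dict],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until `training_in_progress` is false, then return the last status."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if not status["training_in_progress"]:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("training did not finish before the timeout")
        sleep(interval)


# Demo with a fake fetcher that reports "in progress" twice, then done.
fake = iter([
    {"training_in_progress": True},
    {"training_in_progress": True},
    {"training_in_progress": False, "is_trained": True},
])
final = wait_for_training(lambda: next(fake), interval=0.0)
print(final)
```

Against a live server the call would be `wait_for_training(fetch_status_http)`.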
Market Data
Data Sources
| Source | Use | API |
|---|---|---|
| Binance | Crypto (BTC, ETH) | REST + WebSocket |
| Mock Data | Testing | Generated locally |
OHLCV Structure
from dataclasses import dataclass

import numpy as np

@dataclass
class OHLCV:
    timestamp: np.ndarray  # Epoch milliseconds
    open: np.ndarray       # Open price
    high: np.ndarray       # High price
    low: np.ndarray        # Low price
    close: np.ndarray      # Close price
    volume: np.ndarray     # Volume
Files
apps/ml-services/
├── src/
│   ├── api/
│   │   ├── server.py          # FastAPI app
│   │   └── schemas/
│   │       └── prediction.py  # Pydantic schemas
│   ├── data/
│   │   └── market_data.py     # MarketDataFetcher
│   └── models/
│       ├── xgboost_model.py   # XGBoostPredictor
│       ├── predictor.py       # MaxMinPricePredictor
│       └── indicators.py      # Technical indicators
├── trained_models/            # Saved models
│   ├── xgb_high.json
│   └── xgb_low.json
└── environment-cpu.yml        # Conda environment
Dependencies
# Main
- python=3.11
- fastapi=0.115
- uvicorn
- xgboost
- scikit-learn
- pandas
- numpy
- loguru
- aiohttp
- requests
Configuration
Environment Variables
No environment variables are required.
Optional:
ML_MODEL_PATH=./trained_models
BINANCE_API_KEY=xxx # Optional, for rate limits
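A sketch of how a service might read these optional variables. The `Settings` class and `load_settings` function are hypothetical, and treating `./trained_models` as the fallback when `ML_MODEL_PATH` is unset is an assumption based on the example value above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    model_path: str
    binance_api_key: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read optional settings, falling back to defaults when unset."""
    return Settings(
        # Assumed default: the example value shown in this README.
        model_path=env.get("ML_MODEL_PATH", "./trained_models"),
        # An empty string is treated the same as "not set".
        binance_api_key=env.get("BINANCE_API_KEY") or None,
    )


settings = load_settings({"ML_MODEL_PATH": "/srv/models"})
print(settings)
```

Passing the environment as a mapping keeps the function easy to test without touching the real process environment.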
Start the Server
cd apps/ml-services
conda activate trading-ml
uvicorn src.api.server:app --host 0.0.0.0 --port 8000 --reload
Limitations
- Supported symbols: only BTCUSDT and ETHUSDT for training
- Maximum horizon: 6 hours (72 candles of 5 min)
- Binance rate limits: 1200 requests/min
- Accuracy: typical MAE of 0.1% to 2%
Upcoming Improvements
- GRU model for sequential patterns
- XGBoost + GRU ensemble
- Support for more symbols (XAU, EUR)
- Tick-level predictions
- AutoML for hyperparameter optimization
Assigned DDL Schemas
This module owns the following DDL schema:
| Schema | Tables | Description |
|---|---|---|
| ml | 12 | models, model_versions, predictions, signals, signal_subscriptions, backtests, backtest_results, feature_sets, training_jobs, ensemble_models, ensemble_predictions, model_metrics |
Total tables: 12
DDL drift note: Earlier documentation did not include a DDL schemas section. The 12 tables cover the full ML lifecycle: training (models, model_versions, training_jobs, feature_sets), prediction (predictions, signals, signal_subscriptions), evaluation (backtests, backtest_results, model_metrics), and ensemble (ensemble_models, ensemble_predictions). Updated by TASK-2026-02-06 F2.6.
Documentation generated: 2025-12-05