---
id: "README"
title: "ML Signals and Predictions"
type: "Documentation"
project: "trading-platform"
version: "1.0.0"
updated_date: "2026-02-06"
---

# OQI-006: ML Signals and Predictions

**Status:** ✅ Implemented
**Date:** 2025-12-05
**Module:** `apps/ml-services`

---

## Description

An XGBoost-based price prediction system that predicts:
- The expected maximum price over the time horizon
- The expected minimum price over the time horizon
- The confidence level of the prediction

---

## Architecture

```
┌─────────────────┐     ┌─────────────────────────────────────────┐
│ Binance API     │────▶│          ML SERVICES (FastAPI)          │
│ (Market Data)   │     │                Port 8000                │
└─────────────────┘     │                                         │
                        │   ┌──────────────┐  ┌──────────────┐    │
                        │   │ MarketData   │  │ XGBoost      │    │
                        │   │ Fetcher      │──│ Predictor    │    │
                        │   └──────────────┘  └──────────────┘    │
                        │          │                 │            │
                        │          ▼                 ▼            │
                        │   ┌──────────────┐  ┌──────────────┐    │
                        │   │ Feature      │  │ Training     │    │
                        │   │ Engineering  │  │ Pipeline     │    │
                        │   └──────────────┘  └──────────────┘    │
                        └─────────────────────────────────────────┘
```

---

## Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Health check |
| GET | `/api/stats` | Service status |
| GET | `/api/predict/{symbol}` | Price predictions |
| POST | `/api/train/{symbol}` | Train the model |
| GET | `/api/training/status` | Training status |
| GET | `/api/signals/{symbol}` | Trading signals |
| GET | `/api/indicators/{symbol}` | Technical indicators |
| WS | `/ws/{symbol}` | Real-time predictions |

---

## XGBoost Model

### Configuration

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    n_estimators: int = 100        # Number of trees
    max_depth: int = 6             # Maximum tree depth
    learning_rate: float = 0.1     # Learning rate
    subsample: float = 0.8         # Row subsample ratio per tree
    colsample_bytree: float = 0.8  # Column subsample ratio per tree
    min_child_weight: int = 1      # Minimum sum of instance weights in a child
    random_state: int = 42         # Random seed
```
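
The field names in `ModelConfig` match keyword arguments accepted by XGBoost's scikit-learn style regressor, so the configuration can be passed through as a plain dictionary. The sketch below shows one way to do that; the helper `to_xgb_params` is hypothetical, and how the service actually builds its models is not shown in this document.

```python
from dataclasses import asdict, dataclass


@dataclass
class ModelConfig:
    n_estimators: int = 100
    max_depth: int = 6
    learning_rate: float = 0.1
    subsample: float = 0.8
    colsample_bytree: float = 0.8
    min_child_weight: int = 1
    random_state: int = 42


def to_xgb_params(config: ModelConfig) -> dict:
    """Hypothetical helper: turn the config into regressor keyword arguments."""
    return asdict(config)


# Override one hyperparameter and keep the documented defaults for the rest.
params = to_xgb_params(ModelConfig(max_depth=4))

# With xgboost installed, the dictionary could be used like this (assumption):
#   model = xgboost.XGBRegressor(**params)
```

Keeping the hyperparameters in one dataclass means the high and low models can share a single, typed source of defaults.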

### Features (30+)

**Volatility:**
- `volatility_5`, `volatility_10`, `volatility_20`, `volatility_50`
- `atr_5`, `atr_10`, `atr_20`, `atr_50`

**Momentum:**
- `momentum_5`, `momentum_10`, `momentum_20`
- `roc_5`, `roc_10`, `roc_20`

**Moving averages:**
- `sma_5`, `sma_10`, `sma_20`, `sma_50`
- `ema_5`, `ema_10`, `ema_20`, `ema_50`
- `sma_ratio_5`, `sma_ratio_10`, `sma_ratio_20`, `sma_ratio_50`

**Indicators:**
- `rsi_14` - Relative Strength Index
- `macd`, `macd_signal`, `macd_histogram`
- `bb_position` - Position within the Bollinger Bands

**Volume:**
- `volume_ratio` - Volume relative to its 20-period SMA

**High/Low:**
- `hl_range_pct` - High-low range as a percentage
- `high_distance`, `low_distance`
- `hist_max_ratio_*`, `hist_min_ratio_*`
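
To make the naming scheme above concrete, here is a minimal sketch that computes a few of the windowed features for the latest close. The formulas are common textbook definitions and are assumptions here: the exact definitions used in `indicators.py` are not given in this document, and `window_features` is a hypothetical helper.

```python
from statistics import fmean


def window_features(closes: list[float], n: int) -> dict[str, float]:
    """Compute a few n-period features for the most recent close.

    Assumed definitions (not confirmed by the source):
      sma_n       = mean of the last n closes
      sma_ratio_n = last close / sma_n
      momentum_n  = last close - close n periods ago
      roc_n       = last close / close n periods ago - 1
    """
    if len(closes) <= n:
        raise ValueError(f"need more than {n} closes, got {len(closes)}")
    last = closes[-1]
    past = closes[-1 - n]     # close n periods before the last one
    sma = fmean(closes[-n:])  # simple moving average over the last n closes
    return {
        f"sma_{n}": sma,
        f"sma_ratio_{n}": last / sma,
        f"momentum_{n}": last - past,
        f"roc_{n}": last / past - 1.0,
    }


closes = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0]
features = window_features(closes, 5)
```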

---

## Targets

The model predicts two targets:

1. **max_ratio**: Future high relative to the current price, as a fractional gain
   ```
   max_ratio = future_high / current_price - 1
   ```

2. **min_ratio**: Future low relative to the current price, as a fractional drop
   ```
   min_ratio = 1 - future_low / current_price
   ```
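
The two formulas can be checked with a short worked example. This sketch assumes that `future_high` and `future_low` are the highest high and lowest low over the candles in the horizon; that reading, and the helper name `compute_targets`, are assumptions rather than something this document states.

```python
def compute_targets(
    current_price: float,
    future_highs: list[float],
    future_lows: list[float],
) -> tuple[float, float]:
    """Return (max_ratio, min_ratio) for one sample, using the formulas above."""
    future_high = max(future_highs)  # assumed: highest high within the horizon
    future_low = min(future_lows)    # assumed: lowest low within the horizon
    max_ratio = future_high / current_price - 1
    min_ratio = 1 - future_low / current_price
    return max_ratio, min_ratio


# Price is 100; over the horizon it peaks at 103 and bottoms at 98.
max_ratio, min_ratio = compute_targets(
    100.0,
    future_highs=[101.0, 103.0, 102.0],
    future_lows=[99.0, 98.0, 99.5],
)
# max_ratio is about 0.03 (a 3% rise) and min_ratio about 0.02 (a 2% drop).
```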

---

## Prediction Horizons

| Horizon | Candles (5 min) | Duration | Use |
|---------|-----------------|----------|-----|
| Scalping | 6 | 30 min | Fast trading |
| Intraday | 18 | 90 min | Day trading |
| Swing | 36 | 3 hours | Swing trading |
| Position | 72 | 6 hours | Longer-held positions |
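
Each duration in the table is the candle count multiplied by the 5-minute candle size. A minimal sketch of that mapping follows; the constant and function names are hypothetical, while the lowercase horizon keys match those in the example API response.

```python
CANDLE_MINUTES = 5  # candle size from the table header

# Horizon name -> number of 5-minute candles, as listed in the table.
HORIZON_CANDLES = {
    "scalping": 6,
    "intraday": 18,
    "swing": 36,
    "position": 72,
}


def horizon_minutes(name: str) -> int:
    """Convert a horizon name into its duration in minutes."""
    return HORIZON_CANDLES[name] * CANDLE_MINUTES


minutes = {name: horizon_minutes(name) for name in HORIZON_CANDLES}
```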

---

## Training Metrics

| Metric | Description | Typical value |
|--------|-------------|---------------|
| high_mae | Mean absolute error (high) | 0.1% - 2% |
| high_rmse | Root mean squared error (high) | 0.15% - 2.5% |
| low_mae | Mean absolute error (low) | 0.1% - 2% |
| low_rmse | Root mean squared error (low) | 0.15% - 2.5% |
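
For reference, the two error measures can be computed as below. This is a plain-Python sketch of the standard definitions, assuming the errors are measured on the ratio targets, so that a value of 0.001 corresponds to 0.1%. The service itself may well compute them with scikit-learn.

```python
import math


def mae(y_true: list[float], y_pred: list[float]) -> float:
    """Mean absolute error: the average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: list[float], y_pred: list[float]) -> float:
    """Root mean squared error: the square root of the average squared error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Illustrative ratio targets and predictions (made-up numbers).
actual = [0.010, 0.020, 0.030]
predicted = [0.012, 0.018, 0.034]

high_mae = mae(actual, predicted)
high_rmse = rmse(actual, predicted)
# RMSE is never smaller than MAE because it weights large errors more heavily.
```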

---

## Prediction Example

### Request

```bash
curl "http://localhost:8000/api/predict/BTCUSDT?horizon=all"
```

### Response

```json
{
  "symbol": "BTCUSDT",
  "timestamp": "2025-12-05T18:05:08.889327Z",
  "current_price": 89388.99,
  "predictions": {
    "scalping": {
      "high": 89663.86,
      "low": 88930.53,
      "high_ratio": 1.0031,
      "low_ratio": 0.9949,
      "confidence": 0.69,
      "minutes": 30
    },
    "intraday": {
      "high": 90213.60,
      "low": 88013.61,
      "high_ratio": 1.0093,
      "low_ratio": 0.9848,
      "confidence": 0.59,
      "minutes": 90
    },
    "swing": {
      "high": 91038.21,
      "low": 86638.23,
      "high_ratio": 1.0187,
      "low_ratio": 0.9698,
      "confidence": 0.45,
      "minutes": 180
    },
    "position": {
      "high": 92687.43,
      "low": 83887.47,
      "high_ratio": 1.0378,
      "low_ratio": 0.9405,
      "confidence": 0.45,
      "minutes": 360
    }
  },
  "model_version": "1.0.0",
  "is_trained": true
}
```
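
A client will typically parse this payload and act only on the horizons it trusts. The sketch below works on an abridged copy of the example response. The helpers `confident_horizons` and `range_pct` and the 0.55 threshold are illustrative assumptions, not part of the service.

```python
import json

# Abridged copy of the example response: only the fields used below.
RESPONSE_TEXT = """
{
  "symbol": "BTCUSDT",
  "current_price": 89388.99,
  "predictions": {
    "scalping": {"high": 89663.86, "low": 88930.53, "confidence": 0.69, "minutes": 30},
    "intraday": {"high": 90213.60, "low": 88013.61, "confidence": 0.59, "minutes": 90},
    "swing": {"high": 91038.21, "low": 86638.23, "confidence": 0.45, "minutes": 180},
    "position": {"high": 92687.43, "low": 83887.47, "confidence": 0.45, "minutes": 360}
  }
}
"""


def confident_horizons(payload: dict, min_confidence: float) -> list[str]:
    """Return the horizon names whose confidence is at least min_confidence."""
    return [
        name
        for name, prediction in payload["predictions"].items()
        if prediction["confidence"] >= min_confidence
    ]


def range_pct(payload: dict, horizon: str) -> float:
    """Width of the predicted high-low band as a percentage of the current price."""
    prediction = payload["predictions"][horizon]
    return 100.0 * (prediction["high"] - prediction["low"]) / payload["current_price"]


payload = json.loads(RESPONSE_TEXT)
selected = confident_horizons(payload, min_confidence=0.55)  # arbitrary threshold
scalping_band = range_pct(payload, "scalping")
```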

---

## Training

### Start Training

```bash
curl -X POST "http://localhost:8000/api/train/BTCUSDT?samples=500"
```

### Response

```json
{
  "status": "training_started",
  "symbol": "BTCUSDT",
  "samples": 500,
  "message": "Model training started in background. Check /api/stats for progress."
}
```

### Check Status

```bash
curl http://localhost:8000/api/training/status
```

```json
{
  "training_in_progress": false,
  "is_trained": true,
  "last_training": {
    "symbol": "BTCUSDT",
    "timestamp": "2025-12-05T18:04:49.757994",
    "samples": 500,
    "metrics": {
      "high_mae": 0.00099,
      "high_rmse": 0.00141,
      "low_mae": 0.00173,
      "low_rmse": 0.00284,
      "train_samples": 355,
      "test_samples": 89
    }
  }
}
```
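
Because training runs in the background, a caller has to poll the status endpoint until `training_in_progress` becomes false. The following is a minimal sketch of such a loop. `wait_for_training`, its parameters, and the polling budget are assumptions, and the HTTP call is injected as a function so the example runs without a live server.

```python
import time
from typing import Callable


def wait_for_training(
    fetch_status: Callable[[], dict],
    poll_seconds: float = 5.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until training_in_progress is false, then return the final status."""
    for _ in range(max_polls):
        status = fetch_status()
        if not status["training_in_progress"]:
            return status
        sleep(poll_seconds)
    raise TimeoutError("training did not finish within the polling budget")


# Against a running server, fetch_status could be (assumption):
#   lambda: json.load(urllib.request.urlopen(
#       "http://localhost:8000/api/training/status"))

# Offline demonstration with canned responses and a no-op sleep.
_canned = iter([
    {"training_in_progress": True, "is_trained": False},
    {"training_in_progress": False, "is_trained": True},
])
final_status = wait_for_training(lambda: next(_canned), sleep=lambda seconds: None)
```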

---

## Market Data

### Data Sources

| Source | Use | API |
|--------|-----|-----|
| Binance | Crypto (BTC, ETH) | REST + WebSocket |
| Mock Data | Testing | Generated locally |

### OHLCV Structure

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class OHLCV:
    timestamp: np.ndarray  # Epoch milliseconds
    open: np.ndarray       # Open price
    high: np.ndarray       # High price
    low: np.ndarray        # Low price
    close: np.ndarray      # Close price
    volume: np.ndarray     # Volume
```

---

## Files

```
apps/ml-services/
├── src/
│   ├── api/
│   │   ├── server.py           # FastAPI app
│   │   └── schemas/
│   │       └── prediction.py   # Pydantic schemas
│   ├── data/
│   │   └── market_data.py      # MarketDataFetcher
│   └── models/
│       ├── xgboost_model.py    # XGBoostPredictor
│       ├── predictor.py        # MaxMinPricePredictor
│       └── indicators.py       # Technical indicators
├── trained_models/             # Saved models
│   ├── xgb_high.json
│   └── xgb_low.json
└── environment-cpu.yml         # Conda environment
```

---

## Dependencies

```yaml
# Main dependencies
- python=3.11
- fastapi=0.115
- uvicorn
- xgboost
- scikit-learn
- pandas
- numpy
- loguru
- aiohttp
- requests
```

---

## Configuration

### Environment Variables

No environment variables are required.

Optional:
```env
ML_MODEL_PATH=./trained_models
BINANCE_API_KEY=xxx # Optional, for rate limits
```
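
Since both variables are optional, the service needs a fallback for each. A minimal sketch of reading them is shown below. The `Settings` class and `load_settings` function are hypothetical, and treating `./trained_models` as the default when `ML_MODEL_PATH` is unset is an assumption based on the example value above.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    model_path: Path
    binance_api_key: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the optional variables, falling back when they are unset or empty."""
    return Settings(
        # Assumed default: the example value shown in the env block above.
        model_path=Path(env.get("ML_MODEL_PATH", "./trained_models")),
        # An unset or empty key means unauthenticated Binance requests.
        binance_api_key=env.get("BINANCE_API_KEY") or None,
    )


default_settings = load_settings({})
custom_settings = load_settings({"ML_MODEL_PATH": "/tmp/models", "BINANCE_API_KEY": "abc"})
```

Passing the environment as a mapping keeps the function easy to test without touching the real process environment.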

### Start the Server

```bash
cd apps/ml-services
conda activate trading-ml
uvicorn src.api.server:app --host 0.0.0.0 --port 8000 --reload
```

---

## Limitations

1. **Supported symbols:** Only BTCUSDT and ETHUSDT for training
2. **Maximum horizon:** 6 hours (72 candles of 5 min)
3. **Binance rate limits:** 1200 requests/min
4. **Accuracy:** Typical MAE of 0.1% to 2%

---

## Planned Improvements

- [ ] GRU model for sequential patterns
- [ ] XGBoost + GRU ensemble
- [ ] Support for more symbols (XAU, EUR)
- [ ] Tick-level predictions
- [ ] AutoML for hyperparameter optimization

---

## Assigned DDL Schemas

This module owns the following DDL schema:

| Schema | Tables | Description |
|--------|--------|-------------|
| **ml** | 12 | models, model_versions, predictions, signals, signal_subscriptions, backtests, backtest_results, feature_sets, training_jobs, ensemble_models, ensemble_predictions, model_metrics |

**Total tables:** 12

**DDL drift note:** Earlier documentation did not include a DDL schemas section. The 12 tables cover the full ML lifecycle: training (models, model_versions, training_jobs, feature_sets), prediction (predictions, signals, signal_subscriptions), evaluation (backtests, backtest_results, model_metrics), and ensembles (ensemble_models, ensemble_predictions). Updated by TASK-2026-02-06 F2.6.

---

*Documentation generated: 2025-12-05*