---
id: "README"
title: "ML Signals and Predictions"
type: "Documentation"
project: "trading-platform"
version: "1.0.0"
updated_date: "2026-01-04"
---
# OQI-006: ML Signals and Predictions

**Status:** ✅ Implemented
**Date:** 2025-12-05
**Module:** `apps/ml-services`

---
## Description

An XGBoost-based price prediction system that predicts:

- The expected maximum price over the time horizon
- The expected minimum price over the time horizon
- The confidence level of the prediction
---
## Architecture
```
┌─────────────────┐     ┌─────────────────────────────────────────┐
│  Binance API    │────▶│          ML SERVICES (FastAPI)          │
│  (Market Data)  │     │                Port 8000                │
└─────────────────┘     │                                         │
                        │   ┌──────────────┐  ┌──────────────┐    │
                        │   │ MarketData   │  │   XGBoost    │    │
                        │   │   Fetcher    │──│  Predictor   │    │
                        │   └──────────────┘  └──────────────┘    │
                        │          │                 │            │
                        │          ▼                 ▼            │
                        │   ┌──────────────┐  ┌──────────────┐    │
                        │   │   Feature    │  │   Training   │    │
                        │   │ Engineering  │  │   Pipeline   │    │
                        │   └──────────────┘  └──────────────┘    │
                        └─────────────────────────────────────────┘
```
---
## Endpoints
| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Health check |
| GET | `/api/stats` | Service status |
| GET | `/api/predict/{symbol}` | Price predictions |
| POST | `/api/train/{symbol}` | Train the model |
| GET | `/api/training/status` | Training status |
| GET | `/api/signals/{symbol}` | Trading signals |
| GET | `/api/indicators/{symbol}` | Technical indicators |
| WS | `/ws/{symbol}` | Real-time predictions |
---
## XGBoost Model
### Configuration
```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    n_estimators: int = 100        # Number of trees
    max_depth: int = 6             # Maximum tree depth
    learning_rate: float = 0.1     # Learning rate
    subsample: float = 0.8         # Row subsample per tree
    colsample_bytree: float = 0.8  # Column subsample per tree
    min_child_weight: int = 1      # Minimum child weight needed to split
    random_state: int = 42         # Seed for reproducibility
```
### Features (30+)

**Volatility:**

- `volatility_5`, `volatility_10`, `volatility_20`, `volatility_50`
- `atr_5`, `atr_10`, `atr_20`, `atr_50`

**Momentum:**

- `momentum_5`, `momentum_10`, `momentum_20`
- `roc_5`, `roc_10`, `roc_20`

**Moving Averages:**

- `sma_5`, `sma_10`, `sma_20`, `sma_50`
- `ema_5`, `ema_10`, `ema_20`, `ema_50`
- `sma_ratio_5`, `sma_ratio_10`, `sma_ratio_20`, `sma_ratio_50`

**Indicators:**

- `rsi_14` - Relative Strength Index
- `macd`, `macd_signal`, `macd_histogram`
- `bb_position` - Position within the Bollinger Bands

**Volume:**

- `volume_ratio` - Ratio vs. the 20-period SMA

**High/Low:**

- `hl_range_pct` - High-low range as a percentage
- `high_distance`, `low_distance`
- `hist_max_ratio_*`, `hist_min_ratio_*`
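
The list above names the features but does not give their formulas. The sketch below shows one plausible way to compute a few of them from plain Python lists. The definitions used here are assumptions for illustration (`sma_ratio_n` as the latest close over its n-period SMA, `roc_n` as the n-period rate of change, `hl_range_pct` as the candle range over its close, `volume_ratio` as the latest volume over its 20-period SMA) and may differ from what `indicators.py` actually does.

```python
def sma(values, n):
    """Simple moving average of the last n values."""
    if len(values) < n:
        raise ValueError("need at least n values")
    return sum(values[-n:]) / n


def sma_ratio(closes, n):
    """Latest close relative to its n-period SMA (assumed definition)."""
    return closes[-1] / sma(closes, n)


def roc(closes, n):
    """n-period rate of change: fractional move over the last n candles."""
    if len(closes) <= n:
        raise ValueError("need more than n values")
    return closes[-1] / closes[-1 - n] - 1.0


def hl_range_pct(high, low, close):
    """High-low range of one candle as a percentage of its close."""
    return (high - low) / close * 100.0


def volume_ratio(volumes, n=20):
    """Latest volume relative to its n-period SMA (assumed definition)."""
    return volumes[-1] / sma(volumes, n)


closes = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0]
features = {
    "sma_ratio_5": sma_ratio(closes, 5),
    "roc_5": roc(closes, 5),
    "hl_range_pct": hl_range_pct(high=106.0, low=104.0, close=105.0),
}
print(features)
```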
---
## Targets

The model predicts two targets:

1. **max_ratio**: fractional distance from the current price up to the future maximum

   ```
   max_ratio = future_high / current_price - 1
   ```

2. **min_ratio**: fractional distance from the current price down to the future minimum

   ```
   min_ratio = 1 - future_low / current_price
   ```
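
As an illustration of the two formulas, the following sketch computes both targets for a single candle. The function name `make_targets`, the use of the close as `current_price`, and the choice to look at the next `horizon` candles (excluding the current one) are assumptions, not details confirmed by this README.

```python
def make_targets(highs, lows, closes, i, horizon):
    """Return (max_ratio, min_ratio) for candle i over the next `horizon` candles."""
    window = slice(i + 1, i + 1 + horizon)
    future_highs = highs[window]
    future_lows = lows[window]
    if len(future_highs) < horizon:
        raise ValueError("not enough future candles for this horizon")
    current_price = closes[i]  # assumption: the close is the reference price
    max_ratio = max(future_highs) / current_price - 1
    min_ratio = 1 - min(future_lows) / current_price
    return max_ratio, min_ratio


highs = [101.0, 102.0, 104.0, 103.0]
lows = [99.0, 100.0, 101.0, 98.0]
closes = [100.0, 101.0, 103.0, 99.0]

max_ratio, min_ratio = make_targets(highs, lows, closes, i=0, horizon=3)
print(round(max_ratio, 4), round(min_ratio, 4))  # → 0.04 0.02
```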
---
## Prediction Horizons
| Horizon | Candles (5 min) | Time | Use |
|---------|-----------------|------|-----|
| Scalping | 6 | 30 min | Fast trading |
| Intraday | 18 | 90 min | Day trading |
| Swing | 36 | 3 hours | Swing trading |
| Position | 72 | 6 hours | Longer-term positions |
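
The time column follows directly from the candle count: each horizon lasts its number of candles times 5 minutes. A minimal sketch of that mapping, with the dictionary and function names chosen here for illustration:

```python
CANDLE_MINUTES = 5

# Candle counts per horizon, taken from the table above.
HORIZON_CANDLES = {
    "scalping": 6,
    "intraday": 18,
    "swing": 36,
    "position": 72,
}


def horizon_minutes(name):
    """Wall-clock length of a horizon in minutes."""
    return HORIZON_CANDLES[name] * CANDLE_MINUTES


for name in HORIZON_CANDLES:
    print(name, horizon_minutes(name))
```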
---
## Training Metrics
| Metric | Description | Typical Value |
|--------|-------------|---------------|
| high_mae | Mean absolute error (high) | 0.1% - 2% |
| high_rmse | Root mean squared error (high) | 0.15% - 2.5% |
| low_mae | Mean absolute error (low) | 0.1% - 2% |
| low_rmse | Root mean squared error (low) | 0.15% - 2.5% |
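
For reference, this is how MAE and RMSE are conventionally computed. The sketch assumes the errors are measured on the ratio targets, so a value of 0.001 reads as 0.1%; the sample numbers are made up for illustration.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large misses more than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Illustrative ratio targets and predictions (not real model output).
y_true = [0.004, 0.002, 0.006, 0.001]
y_pred = [0.003, 0.003, 0.004, 0.001]

print(f"mae={mae(y_true, y_pred):.5f} rmse={rmse(y_true, y_pred):.5f}")
```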
---
## Prediction Example
### Request
```bash
curl "http://localhost:8000/api/predict/BTCUSDT?horizon=all"
```
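The same request can be made from Python using only the standard library. This is a minimal sketch: the helper names are hypothetical, and `fetch_prediction` needs the service running on `localhost:8000`.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # host and port from the curl example


def build_predict_url(symbol, horizon="all", base_url=BASE_URL):
    """Build the /api/predict/{symbol} URL with the horizon query parameter."""
    query = urlencode({"horizon": horizon})
    return f"{base_url}/api/predict/{quote(symbol, safe='')}?{query}"


def fetch_prediction(symbol, horizon="all", timeout=10.0):
    """GET the prediction and decode the JSON body (needs a running server)."""
    with urlopen(build_predict_url(symbol, horizon), timeout=timeout) as resp:
        return json.load(resp)


print(build_predict_url("BTCUSDT"))
```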
### Response
```json
{
  "symbol": "BTCUSDT",
  "timestamp": "2025-12-05T18:05:08.889327Z",
  "current_price": 89388.99,
  "predictions": {
    "scalping": {
      "high": 89663.86,
      "low": 88930.53,
      "high_ratio": 1.0031,
      "low_ratio": 0.9949,
      "confidence": 0.69,
      "minutes": 30
    },
    "intraday": {
      "high": 90213.60,
      "low": 88013.61,
      "high_ratio": 1.0093,
      "low_ratio": 0.9848,
      "confidence": 0.59,
      "minutes": 90
    },
    "swing": {
      "high": 91038.21,
      "low": 86638.23,
      "high_ratio": 1.0187,
      "low_ratio": 0.9698,
      "confidence": 0.45,
      "minutes": 180
    },
    "position": {
      "high": 92687.43,
      "low": 83887.47,
      "high_ratio": 1.0378,
      "low_ratio": 0.9405,
      "confidence": 0.45,
      "minutes": 360
    }
  },
  "model_version": "1.0.0",
  "is_trained": true
}
```
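
In this sample, `high_ratio` and `low_ratio` sit close to `high / current_price` and `low / current_price`: they match to four decimals for scalping and drift by up to roughly 0.002 for the position horizon. That suggests the response reports plain ratios rather than the offset form used for the training targets, but this is an inference from the sample, not something the README states. The sketch below makes the comparison on a trimmed copy of the response.

```python
import json

# Trimmed copy of the sample response (two of the four horizons).
sample = json.loads("""
{
  "current_price": 89388.99,
  "predictions": {
    "scalping": {"high": 89663.86, "low": 88930.53, "high_ratio": 1.0031, "low_ratio": 0.9949},
    "position": {"high": 92687.43, "low": 83887.47, "high_ratio": 1.0378, "low_ratio": 0.9405}
  }
}
""")

price = sample["current_price"]
gaps = {}
for name, pred in sample["predictions"].items():
    implied_high = pred["high"] / price
    implied_low = pred["low"] / price
    # Absolute difference between the implied and the reported ratios.
    gaps[name] = (
        abs(implied_high - pred["high_ratio"]),
        abs(implied_low - pred["low_ratio"]),
    )
    print(f"{name}: high/price={implied_high:.4f} (reported {pred['high_ratio']}), "
          f"low/price={implied_low:.4f} (reported {pred['low_ratio']})")
```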
---
## Training
### Start Training
```bash
curl -X POST "http://localhost:8000/api/train/BTCUSDT?samples=500"
```
### Response
```json
{
  "status": "training_started",
  "symbol": "BTCUSDT",
  "samples": 500,
  "message": "Model training started in background. Check /api/stats for progress."
}
```
### Check Status
```bash
curl http://localhost:8000/api/training/status
```
```json
{
  "training_in_progress": false,
  "is_trained": true,
  "last_training": {
    "symbol": "BTCUSDT",
    "timestamp": "2025-12-05T18:04:49.757994",
    "samples": 500,
    "metrics": {
      "high_mae": 0.00099,
      "high_rmse": 0.00141,
      "low_mae": 0.00173,
      "low_rmse": 0.00284,
      "train_samples": 355,
      "test_samples": 89
    }
  }
}
```
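
In the status output, 355 training rows plus 89 test rows give 444 usable rows out of the 500 requested, which is consistent with an 80/20 split after dropping 56 rows. One possible explanation, offered purely as an assumption, is a 50-candle warm-up for the longest lookback features (such as `sma_50`) plus 6 candles of look-ahead for the scalping target. The sketch reproduces the counts under that assumption; the function and parameter names are hypothetical.

```python
def split_counts(samples, lookback=50, horizon=6, train_fraction=0.8):
    """Return (train, test) row counts after trimming warm-up and look-ahead rows."""
    usable = samples - lookback - horizon  # assumed trimming rule
    if usable <= 0:
        raise ValueError("not enough samples for this lookback and horizon")
    train = int(usable * train_fraction)
    return train, usable - train


print(split_counts(500))  # → (355, 89)
```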
---
## Market Data
### Data Sources
| Source | Use | API |
|--------|-----|-----|
| Binance | Crypto (BTC, ETH) | REST + WebSocket |
| Mock Data | Testing | Generated locally |

### OHLCV Structure
```python
from dataclasses import dataclass

import numpy as np


@dataclass
class OHLCV:
    timestamp: np.ndarray  # Epoch milliseconds
    open: np.ndarray       # Open price
    high: np.ndarray       # High price
    low: np.ndarray        # Low price
    close: np.ndarray      # Close price
    volume: np.ndarray     # Volume
```
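
For the locally generated mock data, a generator along these lines would produce candles with the same six fields. It is a sketch under stated assumptions: the function name, the random-walk parameters, and the use of plain lists instead of `np.ndarray` are all illustrative choices, not the project's actual mock generator.

```python
import random


def make_mock_candles(n, start_price=100.0, step_ms=5 * 60 * 1000, seed=42):
    """Generate n synthetic 5-minute candles as a dict of parallel lists."""
    rng = random.Random(seed)
    fields = ("timestamp", "open", "high", "low", "close", "volume")
    data = {field: [] for field in fields}
    price = start_price
    for i in range(n):
        open_ = price
        close = open_ * (1 + rng.gauss(0, 0.001))  # small random return
        # Wicks extend beyond the candle body by a small random amount.
        high = max(open_, close) * (1 + abs(rng.gauss(0, 0.0005)))
        low = min(open_, close) * (1 - abs(rng.gauss(0, 0.0005)))
        data["timestamp"].append(i * step_ms)  # milliseconds from an arbitrary origin
        data["open"].append(open_)
        data["high"].append(high)
        data["low"].append(low)
        data["close"].append(close)
        data["volume"].append(rng.uniform(1.0, 10.0))
        price = close  # next candle opens at the previous close
    return data


candles = make_mock_candles(10)
print(len(candles["close"]), candles["timestamp"][:3])
```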
---
## Files
```
apps/ml-services/
├── src/
│   ├── api/
│   │   ├── server.py           # FastAPI app
│   │   └── schemas/
│   │       └── prediction.py   # Pydantic schemas
│   ├── data/
│   │   └── market_data.py      # MarketDataFetcher
│   └── models/
│       ├── xgboost_model.py    # XGBoostPredictor
│       ├── predictor.py        # MaxMinPricePredictor
│       └── indicators.py       # Technical indicators
├── trained_models/             # Saved models
│   ├── xgb_high.json
│   └── xgb_low.json
└── environment-cpu.yml         # Conda environment
```
---
## Dependencies
```yaml
# Main dependencies
- python=3.11
- fastapi=0.115
- uvicorn
- xgboost
- scikit-learn
- pandas
- numpy
- loguru
- aiohttp
- requests
```
---
## Configuration
### Environment Variables
No environment variables are required.

Optional:
```env
ML_MODEL_PATH=./trained_models
# Optional, for rate limits
BINANCE_API_KEY=xxx
```
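Because both variables are optional, code that reads them needs a fallback for each. A minimal sketch, assuming `./trained_models` is also the default when `ML_MODEL_PATH` is unset (the README shows it only as an example value) and using a hypothetical `load_settings` helper:

```python
import os
from pathlib import Path


def load_settings(env=os.environ):
    """Read the optional settings, falling back to defaults when unset."""
    return {
        # Assumed default: the example value shown in the env block.
        "model_path": Path(env.get("ML_MODEL_PATH", "./trained_models")),
        # None when no API key is provided.
        "binance_api_key": env.get("BINANCE_API_KEY"),
    }


settings = load_settings({})  # empty mapping simulates an unset environment
print(settings["model_path"], settings["binance_api_key"])
```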
### Start the Server
```bash
cd apps/ml-services
conda activate trading-ml
uvicorn src.api.server:app --host 0.0.0.0 --port 8000 --reload
```
---
## Limitations
1. **Supported symbols:** Only BTCUSDT and ETHUSDT for training
2. **Maximum horizon:** 6 hours (72 five-minute candles)
3. **Binance rate limits:** 1200 requests/min
4. **Accuracy:** Typical MAE of 0.1% to 2%
---
## Upcoming Improvements
- [ ] GRU model for sequential patterns
- [ ] XGBoost + GRU ensemble
- [ ] Support for more symbols (XAU, EUR)
- [ ] Tick-level predictions
- [ ] AutoML for hyperparameter optimization
---
*Documentation generated: 2025-12-05*