# ET-ML-009: Multi-Strategy Ensemble Architecture
**Version:** 1.0.0
**Date:** 2026-01-25
**Status:** PLANNED
**Priority:** P0
**Related Task:** TASK-2026-01-25-ML-TRAINING-ENHANCEMENT
---
## 1. EXECUTIVE SUMMARY
This technical specification defines the architecture of a multi-strategy system with a Neural Gating Metamodel for trading predictions. The target is **80% trade effectiveness** (the ensemble metric in Section 6.2), reached by combining 5 diversified strategies.
---
## 2. GENERAL ARCHITECTURE
```
┌──────────────────────────────────────────────────────────────────────────────┐
│                     MULTI-STRATEGY ENSEMBLE ARCHITECTURE                     │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│ [Market Data] → [Feature Engineering] → [5 Strategies] → [Metamodel] → [LLM] │
│                                                                              │
│  ┌────────────────────────────────────────────────────────────────────────┐  │
│  │ LEVEL 0: Attention Base                                                │  │
│  │ Price-Focused Attention (no temporal information)                      │  │
│  └────────────────────────────────────────────────────────────────────────┘  │
│                                      ↓                                       │
│  ┌────────────────────────────────────────────────────────────────────────┐  │
│  │ LEVEL 1: Strategy Models (5)                                           │  │
│  │           ┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐            │  │
│  │           │  PVA  │ │  MRD  │ │  VBP  │ │  MSA  │ │  MTS  │            │  │
│  │           └───┬───┘ └───┬───┘ └───┬───┘ └───┬───┘ └───┬───┘            │  │
│  └───────────────┼─────────┼─────────┼─────────┼─────────┼────────────────┘  │
│                  └─────────┴─────────┴─────────┴─────────┘                   │
│                                      ↓                                       │
│  ┌────────────────────────────────────────────────────────────────────────┐  │
│  │ LEVEL 2: Neural Gating Metamodel                                       │  │
│  │ - Weighted Ensemble                                                    │  │
│  │ - Confidence Calibration                                               │  │
│  └────────────────────────────────────────────────────────────────────────┘  │
│                                      ↓                                       │
│  ┌────────────────────────────────────────────────────────────────────────┐  │
│  │ LEVEL 3: LLM Decision Integration                                      │  │
│  │ - Signal Formatting                                                    │  │
│  │ - Trading Decision (TRADE/NO_TRADE/WAIT)                               │  │
│  └────────────────────────────────────────────────────────────────────────┘  │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘
```
---
## 3. DETAILED STRATEGIES
### 3.1 Strategy 1: PVA (Price Variation Attention)
**Objective:** Capture patterns in pure price variation, independent of time.
| Aspect | Specification |
|--------|---------------|
| **Architecture** | Transformer Encoder (4 layers) + XGBoost Head |
| **Input** | Return sequence [r_1, r_2, ..., r_T], T=100 |
| **Features** | returns (1,5,10,20), acceleration, volatility_returns, skewness, kurtosis |
| **Attention** | Multi-head (8 heads), d_model=256, d_k=32 |
| **Output** | direction [-1,0,+1], magnitude (%), confidence (0-1) |
| **Training** | Walk-forward, batch=256, lr=0.0001, epochs=100 |
| **Models** | 6 (one per asset) |
**Files:**
- `src/models/strategies/pva/transformer_encoder.py`
- `src/models/strategies/pva/xgb_head.py`
- `src/features/returns_features.py`
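
As a rough illustration of the return-based features listed in the PVA table, the sketch below computes multi-period returns, acceleration, and rolling volatility of returns with the standard library only. The function names, the 20-bar window, and the definition of acceleration as the first difference of one-period returns are assumptions made for this example; the actual contents of `src/features/returns_features.py` are not specified in this document. Skewness and kurtosis are omitted for brevity.

```python
from statistics import pstdev


def pct_returns(prices, period=1):
    """Simple returns over `period` bars: p[t] / p[t - period] - 1."""
    return [prices[t] / prices[t - period] - 1.0
            for t in range(period, len(prices))]


def acceleration(returns):
    """First difference of returns (assumed definition of 'acceleration')."""
    return [returns[t] - returns[t - 1] for t in range(1, len(returns))]


def rolling_volatility(returns, window=20):
    """Population standard deviation of returns over a trailing window."""
    return [pstdev(returns[t - window + 1:t + 1])
            for t in range(window - 1, len(returns))]


def build_return_features(prices, periods=(1, 5, 10, 20), window=20):
    """Collect the feature series in a dict keyed by a hypothetical name."""
    one_bar = pct_returns(prices, 1)
    features = {f"ret_{p}": pct_returns(prices, p) for p in periods}
    features["acceleration"] = acceleration(one_bar)
    features["volatility_returns"] = rolling_volatility(one_bar, window)
    return features
```

Each series is shorter than the price series by its own lookback, so the series would need to be aligned on their trailing bars before being stacked into the T=100 input sequence.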
### 3.2 Strategy 2: MRD (Momentum Regime Detection)
**Objective:** Detect the market regime (trend/range) and predict its continuation.
| Aspect | Specification |
|--------|---------------|
| **Architecture** | HMM (3 states) + LSTM (128 units) + XGBoost |
| **Input** | Momentum indicators + HMM state |
| **Features** | RSI, MACD, ROC, ADX, +DI, -DI, EMA crossovers |
| **HMM States** | [Trend Up, Range, Trend Down] |
| **Output** | next_regime, regime_duration, momentum_continuation |
| **Training** | HMM fit + LSTM supervised + XGBoost ensemble |
| **Models** | 6 (one per asset) |
**Files:**
- `src/models/strategies/mrd/hmm_regime.py`
- `src/models/strategies/mrd/lstm_predictor.py`
- `src/features/momentum_features.py`
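
To make two of the momentum features in the MRD table concrete, here is a minimal standard-library sketch of ROC and RSI. It uses simple averages of gains and losses (often called Cutler's RSI) rather than Wilder's smoothing, and it returns 50 for a completely flat window; both choices, along with the function names and default periods, are assumptions for illustration and not necessarily what `src/features/momentum_features.py` will implement.

```python
def rate_of_change(prices, period=10):
    """ROC: fractional price change over `period` bars."""
    return [prices[t] / prices[t - period] - 1.0
            for t in range(period, len(prices))]


def rsi(prices, period=14):
    """RSI from simple averages of gains and losses over `period` bars.

    Returns one value per bar, starting at price index `period`.
    """
    deltas = [prices[t] - prices[t - 1] for t in range(1, len(prices))]
    values = []
    for t in range(period, len(deltas) + 1):
        window = deltas[t - period:t]
        avg_gain = sum(d for d in window if d > 0) / period
        avg_loss = sum(-d for d in window if d < 0) / period
        if avg_loss == 0:
            # No losses in the window: 100 if there were gains, 50 if flat.
            values.append(100.0 if avg_gain > 0 else 50.0)
        else:
            rs = avg_gain / avg_loss
            values.append(100.0 - 100.0 / (1.0 + rs))
    return values
```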
### 3.3 Strategy 3: VBP (Volatility Breakout Predictor)
**Objective:** Predict breakouts from volatility compression.
| Aspect | Specification |
|--------|---------------|
| **Architecture** | CNN 1D (filters [32,64,128]) + Attention + XGBoost |
| **Input** | Volatility and compression features |
| **Features** | ATR, BB_width, BB_squeeze, Keltner_squeeze, compression_score |
| **Output** | breakout_next_12 (bool), direction, magnitude |
| **Training** | Imbalanced sampling (oversample breakouts 3x) |
| **Models** | 6 (one per asset) |
**Files:**
- `src/models/strategies/vbp/cnn_encoder.py`
- `src/models/strategies/vbp/breakout_classifier.py`
- `src/features/volatility_features.py`
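
The VBP training row calls for oversampling breakouts 3x. One plausible reading, sketched below, is that every breakout sample appears three times in the training set while non-breakout samples appear once. The function name, the `seed` parameter, and this interpretation are assumptions for the example.

```python
import random


def oversample_breakouts(samples, labels, factor=3, seed=0):
    """Repeat each positive (breakout) sample `factor` times, then shuffle.

    Assumes 'oversample 3x' means each breakout row appears three times;
    negative rows are kept once.
    """
    rng = random.Random(seed)
    expanded = []
    for x, y in zip(samples, labels):
        repeats = factor if y else 1
        expanded.extend([(x, y)] * repeats)
    rng.shuffle(expanded)
    xs = [x for x, _ in expanded]
    ys = [y for _, y in expanded]
    return xs, ys
```

Oversampling should be applied to the training fold only, after the train/validation split, so that duplicated breakout rows never appear on both sides of the split.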
### 3.4 Strategy 4: MSA (Market Structure Analysis)
**Objective:** Analyze market structure (ICT/SMC concepts).
| Aspect | Specification |
|--------|---------------|
| **Architecture** | XGBoost (optionally a GNN for relational structure) |
| **Input** | Swing points, BOS, CHoCH, FVG, Order Blocks |
| **Features** | swing_high/low, higher_high/lower_low, bos_up/down, choch, fvg, ob |
| **Output** | next_bos_direction, poi_reaction, structure_continuation |
| **Training** | Supervised, with structure labels |
| **Models** | 6 (one per asset) |
**Files:**
- `src/models/strategies/msa/structure_detector.py`
- `src/models/strategies/msa/structure_predictor.py`
- `src/features/structure_features.py`
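
Swing points are the base of the other structure features (BOS, CHoCH, higher highs, lower lows). A common way to detect them is a k-bar fractal rule, sketched below; the function name and the default `k=2` are assumptions, and the detector in `src/models/strategies/msa/structure_detector.py` may use a different rule.

```python
def swing_points(highs, lows, k=2):
    """Return (swing_high_indices, swing_low_indices) using a k-bar fractal rule.

    A bar is a swing high if its high is strictly greater than the highs of
    the k bars on each side; swing lows are defined symmetrically on lows.
    """
    swing_highs, swing_lows = [], []
    for i in range(k, len(highs) - k):
        neighbors = list(range(i - k, i)) + list(range(i + 1, i + k + 1))
        if all(highs[i] > highs[j] for j in neighbors):
            swing_highs.append(i)
        if all(lows[i] < lows[j] for j in neighbors):
            swing_lows.append(i)
    return swing_highs, swing_lows
```

Because the rule looks k bars ahead, a swing at bar i is only confirmed at bar i + k; features built from it should be shifted accordingly to avoid lookahead in training and backtesting.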
### 3.5 Strategy 5: MTS (Multi-Timeframe Synthesis)
**Objective:** Synthesize signals from multiple timeframes.
| Aspect | Specification |
|--------|---------------|
| **Architecture** | Hierarchical Attention Network |
| **Input** | Aggregated features from 5m, 15m, 1h, 4h |
| **Features** | All base features + tf_alignment_score + conflict_score |
| **Output** | unified_direction, confidence_by_alignment, optimal_entry_tf |
| **Training** | Hierarchical loss with learnable TF weights |
| **Models** | 6 (one per asset) |
**Files:**
- `src/models/strategies/mts/hierarchical_attention.py`
- `src/models/strategies/mts/tf_aggregator.py`
- `src/features/multitf_features.py`
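
This specification names `tf_alignment_score` and `conflict_score` but does not define them. The sketch below shows one possible pair of definitions, purely as an assumption: alignment as the absolute mean direction across timeframes, and conflict as the share of timeframe pairs pointing in opposite directions.

```python
from itertools import combinations


def tf_alignment_score(directions):
    """|mean direction| across timeframes: 1.0 if all agree, 0.0 if balanced."""
    values = list(directions.values())
    return abs(sum(values)) / len(values)


def conflict_score(directions):
    """Fraction of timeframe pairs whose directions have opposite signs."""
    pairs = list(combinations(directions.values(), 2))
    opposed = sum(1 for a, b in pairs if a * b < 0)
    return opposed / len(pairs)


# Hypothetical per-timeframe directions in {-1, 0, +1}.
example = {"5m": 1, "15m": 1, "1h": 1, "4h": -1}
alignment = tf_alignment_score(example)
conflict = conflict_score(example)
```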
---
## 4. NEURAL GATING METAMODEL
### 4.1 Architecture
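```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class NeuralGatingMetamodel(nn.Module):
    def __init__(self, n_strategies=5, d_input=10, d_context=16, d_hidden=256):
        super().__init__()
        # d_context (size of the market-context vector) was undefined in the
        # draft; the default of 16 is a placeholder, not a specified value.
        # Shared trunk: feeds both the gating head and the confidence head,
        # so the confidence head receives the d_hidden // 2 inputs it expects.
        self.trunk = nn.Sequential(
            nn.Linear(n_strategies * d_input + d_context, d_hidden),
            nn.BatchNorm1d(d_hidden),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(d_hidden, d_hidden // 2),
            nn.ReLU(),
        )
        self.gate_head = nn.Linear(d_hidden // 2, n_strategies)
        self.confidence_head = nn.Sequential(
            nn.Linear(d_hidden // 2, 1),
            nn.Sigmoid(),
        )

    def forward(self, strategy_outputs, market_context):
        # Concatenate strategy outputs + context.
        # Assumed shapes: s['pred_conf'] is (batch, d_input),
        # market_context is (batch, d_context).
        features = torch.cat([
            *[s['pred_conf'] for s in strategy_outputs.values()],
            market_context,
        ], dim=-1)
        hidden = self.trunk(features)
        # Gating network
        gate_logits = self.gate_head(hidden)
        weights = F.softmax(gate_logits, dim=-1)
        # Weighted ensemble; assumed shape of s['prediction'] is (batch,)
        predictions = torch.stack([
            s['prediction'] for s in strategy_outputs.values()
        ], dim=-1)
        final_pred = (weights * predictions).sum(dim=-1)
        return {
            'prediction': final_pred,
            'weights': weights,
            'confidence': self.confidence_head(hidden),
        }
```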
### 4.2 Gating Training
- **Loss:** Weighted cross-entropy + KL divergence for regularization
- **Regularization:** Entropy bonus to prevent collapse onto a single strategy
- **Data:** Predictions of the 5 strategies on the validation set
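
The entropy bonus can be illustrated with plain Python: the entropy of the gate weights is subtracted from the task loss, so evenly spread weights are rewarded and a gate that puts all its mass on one strategy is penalized. This is a minimal sketch; the function names are hypothetical, the KL term is left out, and the coefficient of 0.01 is taken from `entropy_regularization` in the metamodel config.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(weights, eps=1e-12):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(w * math.log(w + eps) for w in weights)


def gating_loss(task_loss, gate_logits, entropy_coef=0.01):
    """Task loss minus an entropy bonus on the gate weights.

    Higher entropy (more evenly spread weights) lowers the loss, which
    discourages the gate from collapsing onto a single strategy.
    """
    weights = softmax(gate_logits)
    return task_loss - entropy_coef * entropy(weights)
```

With five strategies the bonus is at most `entropy_coef * ln(5)`, about 0.016 at the default coefficient, so it acts as a mild nudge rather than a dominant term.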
---
## 5. LLM INTEGRATION
### 5.1 Signal Formatter
```python
def format_signal_for_llm(metamodel_output, strategy_outputs, market_context):
    """Render the ensemble output as a Markdown summary for the LLM prompt.

    Assumes metamodel_output has been post-processed to carry 'direction' and
    'magnitude' alongside 'confidence' and 'weights', and that
    calculate_agreement() and format_strategy_table() are defined elsewhere.
    """
    return f"""
## ML Prediction Summary for {market_context['symbol']}
### Ensemble Prediction
- Direction: {metamodel_output['direction']} (confidence: {metamodel_output['confidence']:.1%})
- Predicted Move: {metamodel_output['magnitude']:.2%}
- Strategy Agreement: {calculate_agreement(strategy_outputs)}/5
### Individual Strategies
| Strategy | Direction | Confidence | Weight |
|----------|-----------|------------|--------|
{format_strategy_table(strategy_outputs, metamodel_output['weights'])}
### Market Context
- Volatility Regime: {market_context['volatility_regime']}
- Current Trend: {market_context['trend']}
- Key Levels: {market_context['key_levels']}
"""
```
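
The formatter calls `calculate_agreement` and `format_strategy_table`, which this specification does not define. The sketch below shows one way they might look, assuming `strategy_outputs` maps each strategy name to a dict with `direction` and `confidence`, that `weights` is a plain dict keyed by strategy name, and that agreement means the size of the largest group of strategies sharing a direction. All of these are assumptions for illustration.

```python
from collections import Counter


def calculate_agreement(strategy_outputs):
    """Number of strategies that share the most common direction."""
    directions = [s["direction"] for s in strategy_outputs.values()]
    return Counter(directions).most_common(1)[0][1]


def format_strategy_table(strategy_outputs, weights):
    """Render one Markdown table row per strategy."""
    rows = []
    for name, out in strategy_outputs.items():
        rows.append(
            f"| {name} | {out['direction']:+d} | "
            f"{out['confidence']:.0%} | {weights[name]:.2f} |"
        )
    return "\n".join(rows)


# Hypothetical inputs for illustration.
outputs = {
    "PVA": {"direction": 1, "confidence": 0.72},
    "MRD": {"direction": 1, "confidence": 0.65},
    "VBP": {"direction": 0, "confidence": 0.55},
    "MSA": {"direction": 1, "confidence": 0.80},
    "MTS": {"direction": 1, "confidence": 0.70},
}
gate_weights = {"PVA": 0.25, "MRD": 0.20, "VBP": 0.15, "MSA": 0.25, "MTS": 0.15}
table = format_strategy_table(outputs, gate_weights)
```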
### 5.2 Decision Parsing
```python
def parse_llm_decision(response: str) -> TradingDecision:
    """Parse LLM response into structured trading decision.

    TradingDecision and parse_tp_levels are assumed to be defined elsewhere.
    """
    decision = {}
    for line in response.strip().split('\n'):
        # Split on the first colon only, so a value may itself contain ':'.
        key, _, value = line.strip().partition(':')
        value = value.strip()
        if key == 'DECISION':
            decision['action'] = value
        elif key == 'ENTRY':
            decision['entry'] = float(value)
        elif key == 'STOP_LOSS':
            decision['stop_loss'] = float(value)
        elif key == 'TAKE_PROFIT':
            decision['take_profit'] = parse_tp_levels(value)
        elif key == 'POSITION_SIZE':
            decision['position_size'] = float(value.rstrip('%'))
    return TradingDecision(**decision)
```
---
## 6. METRICS AND VALIDATION
### 6.1 Per-Strategy Metrics
| Metric | Target | Measurement |
|--------|--------|-------------|
| Direction Accuracy | ≥60% | Validation set |
| MAE (Magnitude) | ≤1% | Validation set |
| Sharpe Ratio | ≥1.0 | Backtesting |
| Max Drawdown | ≤20% | Backtesting |
### 6.2 Ensemble Metrics
| Metric | Target | Measurement |
|--------|--------|-------------|
| Trade Effectiveness | ≥80% | Backtesting with LLM |
| Sharpe Ratio | ≥1.5 | Backtesting |
| Max Drawdown | ≤15% | Backtesting |
| Calibration Error | ≤5% | Reliability diagram |
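
The calibration target can be checked with the expected calibration error (ECE), the numeric summary that usually accompanies a reliability diagram. The sketch below assumes equal-width confidence bins and a binary "prediction was correct" outcome; the function name and the choice of 10 bins are illustrative, not part of this specification.

```python
def expected_calibration_error(confidences, outcomes, n_bins=10):
    """ECE: bin-weighted mean of |accuracy - mean confidence|.

    confidences: predicted probabilities in [0, 1]
    outcomes:    1 if the prediction was correct, else 0
    """
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, outcomes):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, hit))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += len(members) / total * abs(accuracy - avg_conf)
    return ece
```

A model that reports 90% confidence but is right half the time scores an ECE of 0.4 under this definition, far above the 5% target.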
### 6.3 Backtesting Protocol
1. **Walk-Forward Validation:** 80% train, 10% validation, 10% test
2. **Out-of-Sample Period:** Last 6 months
3. **Slippage:** 0.5 pips for forex, 0.1% for crypto
4. **Commission:** 0.1% per trade
5. **Position Sizing:** Kelly criterion with a 2% maximum
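
The position-sizing rule in step 5 can be sketched as the classic Kelly fraction clipped to a ceiling. The example assumes the 2% maximum refers to the fraction of account equity risked per trade, and that win probability and payoff ratio are estimated elsewhere; function names are hypothetical.

```python
def kelly_fraction(win_prob, win_loss_ratio):
    """Kelly fraction f* = p - (1 - p) / b for win probability p, payoff ratio b."""
    return win_prob - (1.0 - win_prob) / win_loss_ratio


def position_size(win_prob, win_loss_ratio, max_fraction=0.02):
    """Kelly fraction clipped to the range [0, max_fraction]."""
    raw = kelly_fraction(win_prob, win_loss_ratio)
    return max(0.0, min(raw, max_fraction))
```

With a 60% win probability and a 1:1 payoff, the raw Kelly fraction is about 20%, so the 2% cap is the binding constraint; a negative Kelly fraction (no edge) yields a size of zero.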
---
## 7. CONFIGURATION
### 7.1 Per-Strategy Config
```yaml
# config/strategies/pva.yaml
strategy_name: "PVA"
architecture:
  encoder:
    type: "transformer"
    n_layers: 4
    n_heads: 8
    d_model: 256
    d_ff: 1024
    dropout: 0.1
  head:
    type: "xgboost"
    n_estimators: 500
    max_depth: 6
    learning_rate: 0.05
features:
  returns_periods: [1, 5, 10, 20]
  include_derivatives: true
  sequence_length: 100
training:
  batch_size: 256
  learning_rate: 0.0001
  epochs: 100
  early_stopping_patience: 10
  walk_forward_folds: 5
```
### 7.2 Metamodel Config
```yaml
# config/metamodel.yaml
metamodel_name: "NeuralGating"
architecture:
  n_strategies: 5
  d_hidden: 256
  dropout: 0.3
training:
  batch_size: 128
  learning_rate: 0.001
  epochs: 50
  entropy_regularization: 0.01
calibration:
  method: "isotonic"
  cv_folds: 5
```
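
The `calibration.method: "isotonic"` setting refers to isotonic regression, which maps raw confidence scores to calibrated probabilities through a non-decreasing step function. The pool-adjacent-violators sketch below shows the core idea in plain Python; it is illustrative only, and the function name is hypothetical. Since scikit-learn is already a dependency, its `IsotonicRegression` estimator would likely be the practical choice.

```python
def isotonic_fit(scores, outcomes):
    """Pool-adjacent-violators: fit a non-decreasing step function.

    Returns (sorted_scores, fitted_values), one fitted value per sample.
    """
    pairs = sorted(zip(scores, outcomes))
    # Each block holds [sum of outcomes, count]. Adjacent blocks are merged
    # while the left block's mean exceeds the right block's mean.
    blocks = []
    for _, y in pairs:
        blocks.append([float(y), 1])
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return [s for s, _ in pairs], fitted
```

For example, scores `[0.1, 0.2, 0.3, 0.4]` with outcomes `[0, 1, 0, 1]` violate monotonicity in the middle, so the two middle samples are pooled to 0.5.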
---
## 8. API ENDPOINTS
### 8.1 Prediction Endpoint
```
POST /api/ml/predict/ensemble
Request:
{
  "symbol": "XAUUSD",
  "timeframe": "5m",
  "include_strategies": true
}
Response:
{
  "prediction": {
    "direction": 1,
    "magnitude": 0.0032,
    "confidence": 0.78
  },
  "weights": {
    "PVA": 0.25,
    "MRD": 0.20,
    "VBP": 0.15,
    "MSA": 0.25,
    "MTS": 0.15
  },
  "strategies": {
    "PVA": {"direction": 1, "confidence": 0.72},
    "MRD": {"direction": 1, "confidence": 0.65},
    "VBP": {"direction": 0, "confidence": 0.55},
    "MSA": {"direction": 1, "confidence": 0.80},
    "MTS": {"direction": 1, "confidence": 0.70}
  },
  "timestamp": "2026-01-25T12:00:00Z"
}
```
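
A consumer of this endpoint may want to sanity-check the payload before acting on it. The sketch below validates the invariants implied by the example response: weights that sum to 1, a direction in {-1, 0, +1}, and a confidence in [0, 1]. The function name is hypothetical, and treating `strategies` as absent when `include_strategies` is false is an assumption.

```python
import json
import math


def validate_ensemble_response(payload):
    """Return a list of problems found in an ensemble prediction response."""
    problems = []
    weights = payload.get("weights", {})
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-6):
        problems.append("weights do not sum to 1")
    if payload["prediction"]["direction"] not in (-1, 0, 1):
        problems.append("direction must be -1, 0, or +1")
    if not 0.0 <= payload["prediction"]["confidence"] <= 1.0:
        problems.append("confidence must be in [0, 1]")
    # Assumed: per-strategy details are omitted when include_strategies is false.
    for name in payload.get("strategies", {}):
        if name not in weights:
            problems.append(f"strategy {name} has no weight")
    return problems


sample = json.loads("""
{
  "prediction": {"direction": 1, "magnitude": 0.0032, "confidence": 0.78},
  "weights": {"PVA": 0.25, "MRD": 0.20, "VBP": 0.15, "MSA": 0.25, "MTS": 0.15},
  "strategies": {"PVA": {"direction": 1, "confidence": 0.72}},
  "timestamp": "2026-01-25T12:00:00Z"
}
""")
issues = validate_ensemble_response(sample)
```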
### 8.2 Strategy Detail Endpoint
```
GET /api/ml/strategy/{strategy_name}/{symbol}
Response:
{
  "strategy": "PVA",
  "symbol": "XAUUSD",
  "prediction": {...},
  "attention_scores": [...],
  "feature_importance": {...},
  "model_version": "1.0.0",
  "last_trained": "2026-01-25"
}
```
---
## 9. DEPENDENCIES
### 9.1 Python Packages
```
torch>=2.0.0
xgboost>=2.0.0
hmmlearn>=0.3.0
scikit-learn>=1.3.0
pandas>=2.0.0
numpy>=1.24.0
einops>=0.7.0
```
### 9.2 Infrastructure
- GPU: NVIDIA with 16GB+ VRAM
- PostgreSQL: Stores models and predictions
- Redis: Prediction cache
---
## 10. IMPLEMENTATION ROADMAP
| Phase | Description | Dependency |
|-------|-------------|------------|
| 1 | Data Pipeline + Attention Architecture | - |
| 2a | Strategy PVA | Phase 1 |
| 2b | Strategy MRD | Phase 1 |
| 2c | Strategy VBP | Phase 1 |
| 2d | Strategy MSA | Phase 1 |
| 2e | Strategy MTS | Phase 1 |
| 3a | Neural Gating Metamodel | Phase 2 (all) |
| 3b | LLM Integration | Phase 3a |
| 4 | Backtesting Validation | Phase 3b |
---
## 11. REFERENCES
- **Task:** TASK-2026-01-25-ML-TRAINING-ENHANCEMENT
- **Existing documentation:** ET-ML-001 to ET-ML-008
- **Legacy project:** C:\Empresas\WorkspaceOld\Projects\trading
---
**Status:** PLANNED
**Next step:** Plan approval and start of implementation