# ML Model Architecture and Data Flow

**Version:** 2.0.0
**Date:** 2025-12-08
**Module:** OQI-006-ml-signals
**Author:** Trading Strategist - OrbiQuant IA

---

## Table of Contents

1. [System Overview](#system-overview)
2. [Model Pipeline by Level](#model-pipeline-by-level)
3. [Model Dependencies](#model-dependencies)
4. [Training Pipeline](#training-pipeline)
5. [Inference Pipeline](#inference-pipeline)
6. [LLM Agent Integration](#llm-agent-integration)

---

## System Overview

### High-Level Architecture

```
                          ┌─────────────────────────────────────────────────────────────────┐
                          │                   ORBIQUANT IA ML SYSTEM v2.0                   │
                          ├─────────────────────────────────────────────────────────────────┤
                          │                                                                 │
┌──────────────┐          │  LEVEL 0: DATA INGESTION                                        │
│              │          │   ┌─────────────────────────────────────────────────────────┐   │
│ MARKET       │─────────▶│   │ Market Data (OHLCV) + Timestamps                        │   │
│ DATA         │          │   │ - API Massive (historical)                              │   │
│              │          │   │ - MetaTrader4 (live)                                    │   │
└──────────────┘          │   │ - 10 years of data: XAUUSD, EURUSD, GBPUSD, USDJPY      │   │
                          │   └────────────────────────────┬────────────────────────────┘   │
                          │                                │                                │
                          │  LEVEL 1: FEATURE ENGINEERING                                   │
                          │   ┌────────────────────────────▼────────────────────────────┐   │
                          │   │              Feature Engineering Pipeline               │   │
                          │   │  ┌──────────┬──────────┬──────────┬──────────┬───────┐  │   │
                          │   │  │  Price   │  Volume  │Volatility│  Trend   │ Time  │  │   │
                          │   │  │  Action  │          │          │          │       │  │   │
                          │   │  │   (12)   │   (10)   │   (8)    │   (10)   │  (6)  │  │   │
                          │   │  ├──────────┼──────────┼──────────┼──────────┼───────┤  │   │
                          │   │  │ Structure│Order Flow│ Liquidity│   ICT    │  SMC  │  │   │
                          │   │  │   (12)   │   (10)   │   (8)    │   (15)   │ (12)  │  │   │
                          │   │  └──────────┴──────────┴───┬──────┴──────────┴───────┘  │   │
                          │   │                            │                            │   │
                          │   │  ┌─────────────────────────▼─────────────────────────┐  │   │
                          │   │  │           COMBINED FEATURE VECTOR (103)           │  │   │
                          │   │  └─────────────────────────┬─────────────────────────┘  │   │
                          │   └────────────────────────────┼────────────────────────────┘   │
                          │                                │                                │
                          │  LEVEL 2: PRIMARY MODELS (Parallel)                             │
                          │   ┌────────────────────────────▼────────────────────────────┐   │
                          │   │                                                         │   │
                          │   │     ┌─────────────────┐         ┌─────────────────┐     │   │
                          │   │     │   AMDDetector   │         │ LiquidityHunter │     │   │
                          │   │     │    (XGBoost)    │         │    (XGBoost)    │     │   │
                          │   │     │                 │         │                 │     │   │
                          │   │     │ Input: 50 feat  │         │ Input: 30 feat  │     │   │
                          │   │     │ Output: 4 prob  │         │ Output: 2 prob  │     │   │
                          │   │     └────────┬────────┘         └────────┬────────┘     │   │
                          │   │              │                           │              │   │
                          │   │     ┌────────┴────────┐         ┌────────┴────────┐     │   │
                          │   │     │ ICTContextModel │         │ OrderFlowModel  │     │   │
                          │   │     │  (Rules-based)  │         │ (LSTM/Optional) │     │   │
                          │   │     │                 │         │                 │     │   │
                          │   │     │ Input: ICT feat │         │ Input: Sequence │     │   │
                          │   │     │ Output: score   │         │ Output: score   │     │   │
                          │   │     └────────┬────────┘         └────────┬────────┘     │   │
                          │   │              │                           │              │   │
                          │   └──────────────┼───────────────────────────┼──────────────┘   │
                          │                  │                           │                  │
                          │                  └─────────────┬─────────────┘                  │
                          │                                │                                │
                          │  LEVEL 3: SECONDARY MODELS (Stacking)                           │
                          │   ┌────────────────────────────▼────────────────────────────┐   │
                          │   │                                                         │   │
                          │   │  ┌───────────────────────────────────────────────────┐  │   │
                          │   │  │               FEATURE AUGMENTATION                │  │   │
                          │   │  │    Base Features (103) + Level 2 Outputs (12)     │  │   │
                          │   │  │               = 115 features total                │  │   │
                          │   │  └─────────────────────────┬─────────────────────────┘  │   │
                          │   │                            │                            │   │
                          │   │              ┌─────────────┴────────────┐               │   │
                          │   │              │                          │               │   │
                          │   │      ┌───────▼───────┐        ┌─────────▼─────────┐     │   │
                          │   │      │RangePredictor │        │  TPSLClassifier   │     │   │
                          │   │      │   (XGBoost)   │        │     (XGBoost)     │     │   │
                          │   │      │               │        │                   │     │   │
                          │   │      │ Input: 115    │        │ Input: 115 + Range│     │   │
                          │   │      │ Output:       │───────▶│ Output: P(TP)     │     │   │
                          │   │      │   delta_high  │        │ per R:R config    │     │   │
                          │   │      │   delta_low   │        │                   │     │   │
                          │   │      │ (4 horizons)  │        │                   │     │   │
                          │   │      └───────┬───────┘        └─────────┬─────────┘     │   │
                          │   │              │                          │               │   │
                          │   └──────────────┼──────────────────────────┼───────────────┘   │
                          │                  │                          │                   │
                          │                  └─────────────┬────────────┘                   │
                          │                                │                                │
                          │  LEVEL 4: META-MODEL (Ensemble)                                 │
                          │   ┌────────────────────────────▼────────────────────────────┐   │
                          │   │                                                         │   │
                          │   │  ┌───────────────────────────────────────────────────┐  │   │
                          │   │  │               StrategyOrchestrator                │  │   │
                          │   │  │                                                   │  │   │
                          │   │  │ ┌───────────────────────────────────────────────┐ │  │   │
                          │   │  │ │ INPUTS:                                       │ │  │   │
                          │   │  │ │ - AMD phase_proba[4] + confidence             │ │  │   │
                          │   │  │ │ - Liquidity sweep_proba[2]                    │ │  │   │
                          │   │  │ │ - ICT context_score                           │ │  │   │
                          │   │  │ │ - Range delta_high, delta_low (per horizon)   │ │  │   │
                          │   │  │ │ - TPSL prob_tp_first (per R:R)                │ │  │   │
                          │   │  │ └───────────────────────────────────────────────┘ │  │   │
                          │   │  │                                                   │  │   │
                          │   │  │ ┌───────────────────────────────────────────────┐ │  │   │
                          │   │  │ │ DECISION LOGIC:                               │ │  │   │
                          │   │  │ │ 1. AMD phase filter (acc/dist only)           │ │  │   │
                          │   │  │ │ 2. ICT context alignment                      │ │  │   │
                          │   │  │ │ 3. Range prediction bias                      │ │  │   │
                          │   │  │ │ 4. TPSL probability threshold                 │ │  │   │
                          │   │  │ │ 5. Liquidity risk check                       │ │  │   │
                          │   │  │ │ 6. Confidence aggregation                     │ │  │   │
                          │   │  │ └───────────────────────────────────────────────┘ │  │   │
                          │   │  │                                                   │  │   │
                          │   │  │ ┌───────────────────────────────────────────────┐ │  │   │
                          │   │  │ │ OUTPUT:                                       │ │  │   │
                          │   │  │ │ - action: LONG / SHORT / HOLD                 │ │  │   │
                          │   │  │ │ - confidence: 0-1                             │ │  │   │
                          │   │  │ │ - entry_price                                 │ │  │   │
                          │   │  │ │ - stop_loss                                   │ │  │   │
                          │   │  │ │ - take_profit                                 │ │  │   │
                          │   │  │ │ - position_size                               │ │  │   │
                          │   │  │ │ - reasoning: [explanations]                   │ │  │   │
                          │   │  │ └───────────────────────────────────────────────┘ │  │   │
                          │   │  └───────────────────────────────────────────────────┘  │   │
                          │   │                                                         │   │
                          │   └────────────────────────────┬────────────────────────────┘   │
                          │                                │                                │
                          └────────────────────────────────┬────────────────────────────────┘
                                                           │
                                                           ▼
                          ┌─────────────────────────────────────────────────────────────────┐
                          │                      TRADING SIGNAL OUTPUT                      │
                          │ ┌─────────────────────────────────────────────────────────────┐ │
                          │ │ {                                                           │ │
                          │ │   "action": "LONG",                                         │ │
                          │ │   "symbol": "XAUUSD",                                       │ │
                          │ │   "confidence": 0.78,                                       │ │
                          │ │   "entry_price": 2650.50,                                   │ │
                          │ │   "stop_loss": 2645.20,                                     │ │
                          │ │   "take_profit": 2661.10,                                   │ │
                          │ │   "position_size": 0.15,                                    │ │
                          │ │   "risk_reward": 2.0,                                       │ │
                          │ │   "amd_phase": "accumulation",                              │ │
                          │ │   "killzone": "london_open",                                │ │
                          │ │   "reasoning": ["AMD: Accumulation (78%)", "ICT: OTE", ...] │ │
                          │ │ }                                                           │ │
                          │ └─────────────────────────────────────────────────────────────┘ │
                          └─────────────────────────────────────────────────────────────────┘
```

---

## Model Pipeline by Level

### Level 0: Data Ingestion

**Data sources:**

| Source | Type | Use | Frequency |
|--------|------|-----|-----------|
| API Massive | Historical | Training, backtesting | Daily batch |
| MetaTrader4 (MetaAPI) | Live | Inference | Real-time |
| PostgreSQL | Cache | Processed data | Continuous |

**Supported symbols:**

- XAUUSD (Gold)
- EURUSD
- GBPUSD
- USDJPY

**Timeframes:**

- 5 minutes (primary)
- 15 minutes
- 1 hour
- 4 hours

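The higher timeframes can in principle be derived from the 5-minute base series. The sketch below illustrates that idea under stated assumptions: bars are plain dicts with `open`, `high`, `low`, `close` and `volume` keys, they arrive in chronological order without gaps, and `aggregate_bars` is a hypothetical helper that is not part of the documented pipeline.

```python
from typing import Dict, List

Bar = Dict[str, float]


def aggregate_bars(bars: List[Bar], factor: int) -> List[Bar]:
    """Merge every `factor` consecutive bars into one (e.g. 3 x 5m -> 15m).

    Hypothetical helper: assumes chronological, gap-free input.
    """
    merged = []
    # Only complete groups are emitted; a trailing partial group is dropped.
    for start in range(0, len(bars) - factor + 1, factor):
        chunk = bars[start:start + factor]
        merged.append({
            "open": chunk[0]["open"],                  # first open of the group
            "high": max(b["high"] for b in chunk),     # highest high
            "low": min(b["low"] for b in chunk),       # lowest low
            "close": chunk[-1]["close"],               # last close of the group
            "volume": sum(b["volume"] for b in chunk),
        })
    return merged


bars_5m = [
    {"open": 100.0, "high": 101.0, "low": 99.5, "close": 100.5, "volume": 10.0},
    {"open": 100.5, "high": 102.0, "low": 100.0, "close": 101.5, "volume": 12.0},
    {"open": 101.5, "high": 101.8, "low": 100.8, "close": 101.0, "volume": 8.0},
    {"open": 101.0, "high": 101.2, "low": 100.1, "close": 100.3, "volume": 5.0},
]
bars_15m = aggregate_bars(bars_5m, 3)
```

A production version would also align groups to clock boundaries (for example :00, :15, :30, :45) instead of counting bars from the start of the list.
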
### Level 1: Feature Engineering

**Total: 103 features**

| Category | Count | Target models |
|----------|-------|---------------|
| Price Action | 12 | AMD, Range |
| Volume | 10 | AMD, Range |
| Volatility | 8 | AMD, Range, TPSL |
| Trend | 10 | AMD, Range |
| Market Structure | 12 | AMD, SMC |
| Order Flow | 10 | AMD, Liquidity |
| Liquidity | 8 | Liquidity, TPSL |
| ICT | 15 | ICT Context, Range |
| SMC | 12 | AMD, TPSL |
| Time | 6 | All |

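The category counts above add up to the 103-column feature vector. A minimal sketch of how per-category column slices could be derived from those counts follows; the column ordering (the same order as the table) and the names `FEATURE_GROUPS` and `group_slices` are assumptions made for illustration, since the document does not specify the layout of the vector.

```python
# Category sizes taken from the table above; the ordering is an assumption.
FEATURE_GROUPS = {
    "price_action": 12,
    "volume": 10,
    "volatility": 8,
    "trend": 10,
    "market_structure": 12,
    "order_flow": 10,
    "liquidity": 8,
    "ict": 15,
    "smc": 12,
    "time": 6,
}


def group_slices(groups):
    """Map each category to the slice of columns it occupies (hypothetical layout)."""
    slices = {}
    start = 0
    for name, count in groups.items():
        slices[name] = slice(start, start + count)
        start += count
    return slices


SLICES = group_slices(FEATURE_GROUPS)
TOTAL_FEATURES = sum(FEATURE_GROUPS.values())

# Example: pick the ICT columns out of one 103-wide feature row.
row = list(range(TOTAL_FEATURES))
ict_columns = row[SLICES["ict"]]
```

Keeping the layout in one mapping like this makes it harder for the per-model feature selections to drift apart from the feature pipeline.
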
### Level 2: Primary Models

**Execution: Parallel**

| Model | Type | Input | Output | Metrics |
|-------|------|-------|--------|---------|
| AMDDetector | XGBoost multiclass | 50 features | phase_proba[4], confidence | Accuracy > 70% |
| LiquidityHunter | XGBoost binary | 30 features | sweep_proba[2] | Precision > 70% |
| ICTContextModel | Rules-based | ICT features | context_score (0-1) | N/A |
| OrderFlowModel | LSTM (optional) | Sequence(50, 10) | flow_score | N/A |

**Level 2 outputs:**

```python
level2_outputs = {
    # AMDDetector
    'amd_phase': 'accumulation',  # or manipulation, distribution, neutral
    'amd_prob_neutral': 0.05,
    'amd_prob_accumulation': 0.78,
    'amd_prob_manipulation': 0.12,
    'amd_prob_distribution': 0.05,
    'amd_confidence': 0.78,

    # LiquidityHunter
    'liq_prob_bsl_sweep': 0.35,
    'liq_prob_ssl_sweep': 0.62,

    # ICTContextModel
    'ict_context_score': 0.72,
    'ict_killzone': 'london_open',
    'ict_ote_zone': 'discount',

    # OrderFlowModel (optional)
    'flow_score': 0.65,
    'flow_direction': 'bullish'
}
```

### Level 3: Secondary Models (Stacking)

**Execution: Sequential (depends on Level 2)**

| Model | Base input | Stacking input | Output |
|-------|------------|----------------|--------|
| RangePredictor | 103 features | +12 L2 outputs = 115 | delta_high, delta_low |
| TPSLClassifier | 115 features | +8 Range outputs = 123 | prob_tp_first |

**Feature augmentation for Level 3:**

```python
def augment_features_for_level3(base_features, level2_outputs):
    """
    Augment the base features with the Level 2 outputs.
    """
    augmented = base_features.copy()

    # AMD features (5)
    augmented['amd_prob_neutral'] = level2_outputs['amd_prob_neutral']
    augmented['amd_prob_accumulation'] = level2_outputs['amd_prob_accumulation']
    augmented['amd_prob_manipulation'] = level2_outputs['amd_prob_manipulation']
    augmented['amd_prob_distribution'] = level2_outputs['amd_prob_distribution']
    augmented['amd_confidence'] = level2_outputs['amd_confidence']

    # Liquidity features (2)
    augmented['liq_prob_bsl'] = level2_outputs['liq_prob_bsl_sweep']
    augmented['liq_prob_ssl'] = level2_outputs['liq_prob_ssl_sweep']

    # ICT features (3)
    augmented['ict_context_score'] = level2_outputs['ict_context_score']
    augmented['ict_killzone_encoded'] = encode_killzone(level2_outputs['ict_killzone'])
    augmented['ict_ote_encoded'] = encode_ote_zone(level2_outputs['ict_ote_zone'])

    # Flow features (2) - optional; both default to 0 when OrderFlowModel is disabled.
    # 'flow_direction_encoded' must be supplied already encoded by the caller.
    augmented['flow_score'] = level2_outputs.get('flow_score', 0)
    augmented['flow_direction'] = level2_outputs.get('flow_direction_encoded', 0)

    return augmented  # 103 base + 12 Level 2 = 115 features total
```

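The augmentation step calls `encode_killzone` and `encode_ote_zone` without defining them. One possible shape for these helpers is sketched below. The integer codes, the `none`, `ny_am` and `equilibrium` labels, and the fallback to 0 for unknown values are all assumptions for illustration; only `london_open` and the four discount/premium zone names appear elsewhere in this document.

```python
# Hypothetical ordinal codes; the real mapping is not given in this document.
KILLZONE_CODES = {
    "none": 0,
    "london_open": 1,
    "ny_am": 2,
}

# Signed codes keep the discount-to-premium ordering, which tree models can split on.
OTE_ZONE_CODES = {
    "extreme_discount": -2,
    "discount": -1,
    "equilibrium": 0,
    "premium": 1,
    "extreme_premium": 2,
}


def encode_killzone(name: str) -> int:
    """Return the integer code for a killzone label, 0 if the label is unknown."""
    return KILLZONE_CODES.get(name, 0)


def encode_ote_zone(name: str) -> int:
    """Return the signed code for an OTE zone label, 0 if the label is unknown."""
    return OTE_ZONE_CODES.get(name, 0)


encoded = (encode_killzone("london_open"), encode_ote_zone("discount"))
```

Whatever mapping is used, it has to be identical at training and inference time, so it is worth persisting alongside the model artifacts.
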
**RangePredictor Outputs:**

```python
range_outputs = {
    # 15 min horizon (3 bars)
    'delta_high_15m': 0.0085,  # +0.85%
    'delta_low_15m': 0.0042,   # -0.42%
    'bin_high_15m': 2,         # Medium
    'bin_low_15m': 1,          # Low

    # 1 hour horizon (12 bars)
    'delta_high_1h': 0.0142,   # +1.42%
    'delta_low_1h': 0.0078,    # -0.78%
    'bin_high_1h': 3,          # High
    'bin_low_1h': 2,           # Medium
}
```

**TPSLClassifier Outputs:**

```python
tpsl_outputs = {
    # 15 min horizon, R:R 2:1
    'tp_prob_15m_rr2': 0.68,

    # 15 min horizon, R:R 3:1
    'tp_prob_15m_rr3': 0.54,

    # 1 hour horizon, R:R 2:1
    'tp_prob_1h_rr2': 0.72,

    # 1 hour horizon, R:R 3:1
    'tp_prob_1h_rr3': 0.61,
}
```

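A TP-first probability is easier to interpret next to its R:R configuration. The sketch below works through that arithmetic for the sample values above, under simplifying assumptions that are not stated in this document: every trade closes at either TP or SL, and costs such as spread and slippage are ignored. The function names are hypothetical.

```python
def breakeven_probability(rr: float) -> float:
    """Minimum TP-first probability at which a trade with reward/risk `rr` breaks even."""
    return 1.0 / (1.0 + rr)


def expectancy_in_r(p_tp: float, rr: float) -> float:
    """Expected result per trade in units of risk (R): win `rr` or lose 1."""
    return p_tp * rr - (1.0 - p_tp)


# Sample probabilities from the TPSLClassifier output above.
samples = {
    "15m_rr2": (0.68, 2.0),
    "15m_rr3": (0.54, 3.0),
    "1h_rr2": (0.72, 2.0),
    "1h_rr3": (0.61, 3.0),
}

expectancies = {
    name: round(expectancy_in_r(p, rr), 2) for name, (p, rr) in samples.items()
}
```

Under these assumptions a 2:1 setup needs a TP-first probability above one third to break even and a 3:1 setup needs more than one quarter, so the 0.55 minimum used by the orchestrator is a deliberately conservative filter.
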
### Level 4: Meta-Model

**StrategyOrchestrator:**

```python
class StrategyOrchestrator:
    """
    Meta-model that combines all model outputs
    to generate the final trading signal.
    """

    def __init__(self, config=None):
        self.config = config or {
            'weights': {
                'amd': 0.30,
                'ict': 0.20,
                'range': 0.20,
                'tpsl': 0.20,
                'liquidity': 0.10
            },
            'thresholds': {
                'min_amd_confidence': 0.65,
                'min_tp_probability': 0.55,
                'min_overall_confidence': 0.60,
                'max_liquidity_risk': 0.70
            },
            'risk': {
                'max_position_pct': 0.02,  # 2% of the account
                'default_rr': 2.0
            }
        }

    def generate_signal(self, level2_outputs, level3_outputs, current_price, atr):
        """
        Full decision pipeline. Returns a HOLD signal as soon as a hard filter fails.
        """
        signal = {
            'action': 'HOLD',
            'confidence': 0.0,
            'reasoning': []
        }

        # STEP 1: AMD Phase Filter
        amd_phase = level2_outputs['amd_phase']
        amd_conf = level2_outputs['amd_confidence']

        if amd_phase == 'manipulation':
            signal['reasoning'].append(f'AMD: Manipulation phase detected ({amd_conf:.0%}) - avoiding trade')
            return signal

        if amd_phase == 'neutral':
            signal['reasoning'].append(f'AMD: Neutral phase ({amd_conf:.0%}) - no clear direction')
            return signal

        if amd_conf < self.config['thresholds']['min_amd_confidence']:
            signal['reasoning'].append(f'AMD: Low confidence ({amd_conf:.0%})')
            return signal

        # Determine bias: accumulation -> bullish, distribution -> bearish
        bias = 'bullish' if amd_phase == 'accumulation' else 'bearish'
        signal['reasoning'].append(f'AMD: {amd_phase.capitalize()} ({amd_conf:.0%}) - {bias} bias')

        # STEP 2: ICT Context Alignment
        ict_score = level2_outputs['ict_context_score']
        ict_killzone = level2_outputs['ict_killzone']
        ict_ote = level2_outputs['ict_ote_zone']

        if ict_score > 0.6:
            signal['reasoning'].append(f'ICT: High context score ({ict_score:.0%}), {ict_killzone}, {ict_ote} zone')
        else:
            signal['reasoning'].append(f'ICT: Moderate context ({ict_score:.0%})')

        # Check OTE alignment with bias
        ote_aligned = (bias == 'bullish' and ict_ote in ['discount', 'extreme_discount']) or \
                      (bias == 'bearish' and ict_ote in ['premium', 'extreme_premium'])

        if not ote_aligned:
            signal['reasoning'].append(f'ICT: OTE zone ({ict_ote}) not aligned with {bias} bias')
            # Reduce confidence but don't exit
            ict_score *= 0.5

        # STEP 3: Range Prediction Alignment
        delta_high = level3_outputs['delta_high_15m']
        delta_low = level3_outputs['delta_low_15m']

        range_aligned = (bias == 'bullish' and delta_high > delta_low * 1.5) or \
                        (bias == 'bearish' and delta_low > delta_high * 1.5)

        if range_aligned:
            signal['reasoning'].append(f'Range: Prediction aligned (high: {delta_high:.2%}, low: {delta_low:.2%})')
        else:
            signal['reasoning'].append(f'Range: Weak alignment (high: {delta_high:.2%}, low: {delta_low:.2%})')
            # Don't exit, just note

        # STEP 4: TPSL Probability Check
        tp_prob = level3_outputs['tp_prob_15m_rr2']

        if tp_prob < self.config['thresholds']['min_tp_probability']:
            signal['reasoning'].append(f'TPSL: Low probability ({tp_prob:.0%}) - avoiding trade')
            return signal

        signal['reasoning'].append(f'TPSL: Good probability ({tp_prob:.0%})')

        # STEP 5: Liquidity Risk Check
        liq_bsl = level2_outputs['liq_prob_bsl_sweep']
        liq_ssl = level2_outputs['liq_prob_ssl_sweep']

        if bias == 'bullish' and liq_ssl > self.config['thresholds']['max_liquidity_risk']:
            signal['reasoning'].append(f'Liquidity: High SSL sweep risk ({liq_ssl:.0%}) - reduce size')
            position_multiplier = 0.5
        elif bias == 'bearish' and liq_bsl > self.config['thresholds']['max_liquidity_risk']:
            signal['reasoning'].append(f'Liquidity: High BSL sweep risk ({liq_bsl:.0%}) - reduce size')
            position_multiplier = 0.5
        else:
            position_multiplier = 1.0

        # STEP 6: Calculate Overall Confidence (weighted sum of the five components)
        confidence = 0.0
        confidence += self.config['weights']['amd'] * amd_conf
        confidence += self.config['weights']['ict'] * ict_score
        confidence += self.config['weights']['range'] * (1.0 if range_aligned else 0.5)
        confidence += self.config['weights']['tpsl'] * tp_prob
        confidence += self.config['weights']['liquidity'] * (1 - max(liq_bsl, liq_ssl))

        if confidence < self.config['thresholds']['min_overall_confidence']:
            signal['reasoning'].append(f'Overall confidence too low ({confidence:.0%})')
            return signal

        # STEP 7: Generate Trading Signal
        signal['action'] = 'LONG' if bias == 'bullish' else 'SHORT'
        signal['confidence'] = confidence
        signal['entry_price'] = current_price

        # Calculate SL/TP based on ATR
        sl_distance = atr * 0.3  # 0.3 ATR for SL
        tp_distance = atr * 0.6  # 0.6 ATR for TP (R:R 2:1)

        if bias == 'bullish':
            signal['stop_loss'] = current_price - sl_distance
            signal['take_profit'] = current_price + tp_distance
        else:
            signal['stop_loss'] = current_price + sl_distance
            signal['take_profit'] = current_price - tp_distance

        # Position sizing: notional exposure as a fraction of equity such that a
        # stop-out loses max_position_pct. With tight stops this ratio can exceed 1.0.
        risk_per_trade = self.config['risk']['max_position_pct']
        price_risk_pct = sl_distance / current_price
        signal['position_size'] = (risk_per_trade / price_risk_pct) * position_multiplier

        signal['risk_reward'] = tp_distance / sl_distance
        signal['amd_phase'] = amd_phase
        signal['killzone'] = ict_killzone

        signal['reasoning'].append(f'Signal: {signal["action"]} at {current_price:.2f}')
        signal['reasoning'].append(f'SL: {signal["stop_loss"]:.2f}, TP: {signal["take_profit"]:.2f}, R:R: {signal["risk_reward"]:.1f}')
        signal['reasoning'].append(f'Position: {signal["position_size"]:.1%}, Confidence: {confidence:.0%}')

        return signal
```

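To make the confidence aggregation in step 6 concrete, the sketch below recomputes it in isolation for the sample Level 2 and Level 3 values used earlier in this document. The standalone function `aggregate_confidence` is a hypothetical extraction written for illustration; it mirrors the weighted sum and the OTE penalty from the class above but is not part of it.

```python
# Default weights from StrategyOrchestrator.__init__.
WEIGHTS = {"amd": 0.30, "ict": 0.20, "range": 0.20, "tpsl": 0.20, "liquidity": 0.10}


def aggregate_confidence(amd_conf, ict_score, ote_aligned, range_aligned,
                         tp_prob, liq_bsl, liq_ssl, weights=WEIGHTS):
    """Weighted confidence as in step 6 (hypothetical standalone version)."""
    if not ote_aligned:
        ict_score *= 0.5                      # same penalty as step 2
    range_term = 1.0 if range_aligned else 0.5
    liquidity_term = 1.0 - max(liq_bsl, liq_ssl)
    return (weights["amd"] * amd_conf
            + weights["ict"] * ict_score
            + weights["range"] * range_term
            + weights["tpsl"] * tp_prob
            + weights["liquidity"] * liquidity_term)


# Sample values: accumulation at 0.78, ICT 0.72 in a discount zone (aligned with a
# bullish bias), delta_high 0.0085 > 1.5 * delta_low 0.0042, TP probability 0.68.
range_aligned = 0.0085 > 0.0042 * 1.5
confidence = aggregate_confidence(0.78, 0.72, True, range_aligned, 0.68, 0.35, 0.62)
# 0.234 + 0.144 + 0.200 + 0.136 + 0.038 = 0.752
```

With these inputs the result clears the 0.60 `min_overall_confidence` threshold. If the OTE zone were not aligned, the ICT term would drop from 0.144 to 0.072 and the total to 0.68.
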
---

## Model Dependencies

### Dependency Graph

```
                   ┌────────────────┐
                   │   Raw OHLCV    │
                   │      Data      │
                   └───────┬────────┘
                           │
                           ▼
                   ┌────────────────┐
                   │    Feature     │
                   │  Engineering   │
                   └───────┬────────┘
                           │
        ┌──────────────────┼──────────────────┐
        │                  │                  │
        ▼                  ▼                  ▼
┌────────────────┐ ┌────────────────┐ ┌────────────────┐
│  AMDDetector   │ │LiquidityHunter │ │ICTContextModel │
│                │ │                │ │                │
│ (Independent)  │ │ (Independent)  │ │ (Independent)  │
└───────┬────────┘ └───────┬────────┘ └───────┬────────┘
        │                  │                  │
        └──────────────────┼──────────────────┘
                           │
                           ▼
                   ┌────────────────┐
                   │    Feature     │
                   │  Augmentation  │
                   │   (115 feat)   │
                   └───────┬────────┘
                           │
                           ▼
                   ┌────────────────┐
                   │ RangePredictor │
                   │                │
                   │ (Depends on L2)│
                   └───────┬────────┘
                           │  (stacking)
                           ▼
                   ┌────────────────┐
                   │ TPSLClassifier │
                   │  (Depends on   │
                   │   L2 + Range)  │
                   └───────┬────────┘
                           │
                           ▼
                   ┌────────────────┐
                   │    Strategy    │
                   │  Orchestrator  │
                   │                │
                   │(Depends on all)│
                   └───────┬────────┘
                           │
                           ▼
                   ┌────────────────┐
                   │ Trading Signal │
                   └────────────────┘
```

### Execution Order

**Training:**
```
1. Feature Engineering Pipeline (fit_transform)
2. AMDDetector.fit()           ─┐
3. LiquidityHunter.fit()        ├── Parallel
4. ICTContextModel()           ─┘   (rules-based: instantiated, not fitted)
5. Generate Level 2 outputs for training data
6. RangePredictor.fit() (with augmented features)
7. Generate Range outputs
8. TPSLClassifier.fit() (with Range outputs)
9. StrategyOrchestrator.calibrate() (optional)
```

**Inference:**
```
1. Feature Engineering Pipeline (transform)
2. AMDDetector.predict()       ─┐
3. LiquidityHunter.predict()    ├── Parallel
4. ICTContextModel.score()     ─┘
5. Augment features with Level 2 outputs
6. RangePredictor.predict()
7. Augment features with Range outputs
8. TPSLClassifier.predict()
9. StrategyOrchestrator.generate_signal()
```

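The execution order above follows directly from the dependency graph. As a sketch of how that order, including the parallel stage, could be derived automatically, the example below feeds the graph to `graphlib.TopologicalSorter` from the Python standard library (3.9+). The node names and the `execution_stages` helper are illustrative assumptions.

```python
from graphlib import TopologicalSorter

# Each node maps to the set of nodes it depends on (names are illustrative).
DEPENDENCIES = {
    "feature_engineering": {"raw_ohlcv"},
    "amd_detector": {"feature_engineering"},
    "liquidity_hunter": {"feature_engineering"},
    "ict_context": {"feature_engineering"},
    "feature_augmentation": {"amd_detector", "liquidity_hunter", "ict_context"},
    "range_predictor": {"feature_augmentation"},
    "tpsl_classifier": {"feature_augmentation", "range_predictor"},
    "strategy_orchestrator": {
        "amd_detector", "liquidity_hunter", "ict_context",
        "range_predictor", "tpsl_classifier",
    },
}


def execution_stages(dependencies):
    """Group nodes into stages; nodes within one stage can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()                      # raises CycleError on circular dependencies
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


STAGES = execution_stages(DEPENDENCIES)
for number, stage in enumerate(STAGES):
    print(number, stage)
```

The three Level 2 models land in the same stage, which matches the "Parallel" bracket in the lists above, while RangePredictor and TPSLClassifier each get their own sequential stage.
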
---

## Training Pipeline

### Complete Workflow

```python
import numpy as np


class MLTrainingPipeline:
    """
    Complete training pipeline.
    """

    def __init__(self, data_path, config):
        self.data_path = data_path
        self.config = config
        self.models = {}
        self.feature_pipeline = FeatureEngineeringPipeline()

    def run(self):
        """
        Run the complete training pipeline.
        """
        print("=" * 60)
        print("ORBIQUANT ML TRAINING PIPELINE v2.0")
        print("=" * 60)

        # 1. Load data
        print("\n[1/10] Loading data...")
        df = self.load_data()
        print(f"  Loaded {len(df):,} records")

        # 2. Feature engineering
        print("\n[2/10] Feature engineering...")
        X = self.feature_pipeline.fit_transform(df, df.index)
        print(f"  Generated {X.shape[1]} features")

        # 3. Label targets
        print("\n[3/10] Labeling targets...")
        targets = self.label_all_targets(df)

        # 4. Temporal split
        print("\n[4/10] Temporal split...")
        splits = self.temporal_split(X, targets)
        X_train, X_val, X_test = splits['X']
        y_train, y_val, y_test = splits['y']
        print(f"  Train: {len(X_train):,}, Val: {len(X_val):,}, Test: {len(X_test):,}")

        # 5. Train Level 2 models (parallel)
        print("\n[5/10] Training Level 2 models...")
        self.train_level2_models(X_train, y_train, X_val, y_val)

        # 6. Generate Level 2 outputs
        print("\n[6/10] Generating Level 2 outputs...")
        l2_train = self.predict_level2(X_train)
        l2_val = self.predict_level2(X_val)
        l2_test = self.predict_level2(X_test)

        # 7. Augment features for Level 3
        print("\n[7/10] Augmenting features...")
        X_train_aug = self.augment_features(X_train, l2_train)
        X_val_aug = self.augment_features(X_val, l2_val)
        X_test_aug = self.augment_features(X_test, l2_test)

        # 8. Train RangePredictor
        print("\n[8/10] Training RangePredictor...")
        self.train_range_predictor(X_train_aug, y_train, X_val_aug, y_val)

        # 9. Train TPSLClassifier (with Range outputs)
        print("\n[9/10] Training TPSLClassifier...")
        range_train = self.models['range_predictor'].predict(X_train_aug)
        range_val = self.models['range_predictor'].predict(X_val_aug)
        range_test = self.models['range_predictor'].predict(X_test_aug)
        X_train_full = np.hstack([X_train_aug, range_train])
        X_val_full = np.hstack([X_val_aug, range_val])
        # Needed by evaluate_all below; it was referenced without being defined.
        X_test_full = np.hstack([X_test_aug, range_test])
        self.train_tpsl_classifier(X_train_full, y_train, X_val_full, y_val)

        # 10. Evaluate all
        print("\n[10/10] Final evaluation...")
        metrics = self.evaluate_all(X_test, X_test_aug, X_test_full, y_test, l2_test)

        print("\n" + "=" * 60)
        print("TRAINING COMPLETE")
        print("=" * 60)
        self.print_metrics(metrics)

        # Save models
        self.save_models()

        return metrics

    def train_level2_models(self, X_train, y_train, X_val, y_val):
        """Train the Level 2 models (independent of each other, so parallelizable)."""

        # AMDDetector
        print("  Training AMDDetector...")
        self.models['amd_detector'] = AMDDetector(self.config['amd'])
        self.models['amd_detector'].fit(
            X_train[:, :50],  # First 50 features
            y_train['amd_phase'],
            eval_set=(X_val[:, :50], y_val['amd_phase'])
        )

        # LiquidityHunter
        print("  Training LiquidityHunter...")
        liq_features = self.get_liquidity_feature_indices()
        self.models['liquidity_hunter'] = LiquidityHunter(self.config['liquidity'])
        self.models['liquidity_hunter'].fit(
            X_train[:, liq_features],
            y_train['liquidity_sweep'],
            eval_set=(X_val[:, liq_features], y_val['liquidity_sweep'])
        )

        # ICTContextModel (rules-based, no training needed)
        print("  Initializing ICTContextModel...")
        self.models['ict_context'] = ICTContextModel()

    def predict_level2(self, X):
        """Generate the Level 2 model predictions as a 2-D array."""
        outputs = {}

        # AMDDetector
        amd_proba = self.models['amd_detector'].predict_proba(X[:, :50])
        outputs['amd_prob_neutral'] = amd_proba[:, 0]
        outputs['amd_prob_accumulation'] = amd_proba[:, 1]
        outputs['amd_prob_manipulation'] = amd_proba[:, 2]
        outputs['amd_prob_distribution'] = amd_proba[:, 3]
        outputs['amd_confidence'] = amd_proba.max(axis=1)

        # LiquidityHunter
        liq_features = self.get_liquidity_feature_indices()
        liq_proba = self.models['liquidity_hunter'].predict_proba(X[:, liq_features])
        outputs['liq_prob_bsl'] = liq_proba[:, 0]
        outputs['liq_prob_ssl'] = liq_proba[:, 1]

        # ICTContext
        ict_features = self.get_ict_feature_indices()
        outputs['ict_context_score'] = self.models['ict_context'].score(X[:, ict_features])

        # Columns are ordered alphabetically by key. Only 8 columns are produced here;
        # the killzone/OTE encodings and optional flow features are not generated.
        return np.column_stack([outputs[k] for k in sorted(outputs.keys())])
```

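Step 4 relies on a temporal split whose implementation is not shown. A minimal sketch of a chronological split follows, assuming rows are already sorted by time. The 70/15/15 proportions and the helper name `temporal_split_indices` are assumptions for illustration; the document does not state the actual ratios.

```python
def temporal_split_indices(n_rows: int, train_pct: int = 70, val_pct: int = 15):
    """Return (train, val, test) index ranges in chronological order, no shuffling.

    Percentages are integers so the boundaries are computed without float rounding.
    The 70/15/15 default is an assumed example, not a documented setting.
    """
    if train_pct <= 0 or val_pct <= 0 or train_pct + val_pct >= 100:
        raise ValueError("train_pct and val_pct must be positive and sum to less than 100")
    train_end = n_rows * train_pct // 100
    val_end = n_rows * (train_pct + val_pct) // 100
    return range(0, train_end), range(train_end, val_end), range(val_end, n_rows)


train_idx, val_idx, test_idx = temporal_split_indices(1000)
```

Because the ranges never interleave, every validation and test row is strictly later than every training row, which is the property that keeps future information out of the fitted models.
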
---

## Inference Pipeline

### Real-Time Inference

```python
import numpy as np


class MLInferencePipeline:
    """
    Real-time inference pipeline.
    """

    def __init__(self, models_path):
        # The feature pipeline must exist before load_models() restores its scaler.
        self.feature_pipeline = FeatureEngineeringPipeline()
        self.load_models(models_path)
        self.last_signal = None
        self.signal_cooldown = 60  # seconds

    def load_models(self, path):
        """Load all trained models."""
        self.models = {
            'amd_detector': load_model(f'{path}/amd_detector.pkl'),
            'liquidity_hunter': load_model(f'{path}/liquidity_hunter.pkl'),
            'ict_context': ICTContextModel(),
            'range_predictor': load_model(f'{path}/range_predictor.pkl'),
            'tpsl_classifier': load_model(f'{path}/tpsl_classifier.pkl'),
            'orchestrator': StrategyOrchestrator()
        }
        self.feature_pipeline.load_scaler(f'{path}/scaler.pkl')

    def predict(self, market_data, timestamps):
        """
        Generate a prediction for the given market data.

        Args:
            market_data: DataFrame with OHLCV (minimum 100 bars)
            timestamps: DatetimeIndex with timestamps

        Returns:
            Trading signal dict
        """
        # 1. Feature engineering
        X = self.feature_pipeline.transform(market_data, timestamps)
        X_latest = X[-1:]  # Only latest bar

        # 2. Level 2 predictions
        amd_proba = self.models['amd_detector'].predict_proba(X_latest[:, :50])
        amd_phase = ['neutral', 'accumulation', 'manipulation', 'distribution'][amd_proba.argmax()]

        liq_features = self.get_liquidity_feature_indices()
        liq_proba = self.models['liquidity_hunter'].predict_proba(X_latest[:, liq_features])

        ict_features = self.get_ict_feature_indices()
        ict_score = self.models['ict_context'].score(X_latest[:, ict_features])
        ict_killzone = identify_killzone(timestamps[-1])
        ict_ote = calculate_ote_zone(market_data)

        level2_outputs = {
            'amd_phase': amd_phase,
            'amd_prob_neutral': amd_proba[0, 0],
            'amd_prob_accumulation': amd_proba[0, 1],
            'amd_prob_manipulation': amd_proba[0, 2],
            'amd_prob_distribution': amd_proba[0, 3],
            'amd_confidence': amd_proba.max(),
            'liq_prob_bsl_sweep': liq_proba[0, 0],
            'liq_prob_ssl_sweep': liq_proba[0, 1],
            'ict_context_score': ict_score[0],
            'ict_killzone': ict_killzone,
            'ict_ote_zone': ict_ote
        }

        # 3. Augment features
        X_aug = self.augment_features(X_latest, level2_outputs)

        # 4. Level 3 predictions
        range_pred = self.models['range_predictor'].predict(X_aug)

        X_full = np.hstack([X_aug, range_pred])
        tpsl_pred = self.models['tpsl_classifier'].predict_proba(X_full)

        level3_outputs = {
            'delta_high_15m': range_pred[0, 0],
            'delta_low_15m': range_pred[0, 1],
            'delta_high_1h': range_pred[0, 2],
            'delta_low_1h': range_pred[0, 3],
            'tp_prob_15m_rr2': tpsl_pred[0, 0],
            'tp_prob_15m_rr3': tpsl_pred[0, 1],
            'tp_prob_1h_rr2': tpsl_pred[0, 2],
            'tp_prob_1h_rr3': tpsl_pred[0, 3]
        }

        # 5. Generate signal
        current_price = market_data['close'].iloc[-1]
        atr = calculate_atr(market_data, 14).iloc[-1]

        signal = self.models['orchestrator'].generate_signal(
            level2_outputs,
            level3_outputs,
            current_price,
            atr
        )

        # Add metadata
        signal['timestamp'] = timestamps[-1]
        signal['symbol'] = market_data.attrs.get('symbol', 'UNKNOWN')
        signal['model_outputs'] = {
            'level2': level2_outputs,
            'level3': level3_outputs
        }

        return signal
```

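The pipeline declares a `signal_cooldown` of 60 seconds, but the code shown never applies it. One way such a gate could work is sketched below, assuming the intent is to suppress repeated LONG/SHORT signals inside the cooldown window while always letting HOLD through. The `SignalCooldown` class and that policy are assumptions for illustration, not the documented behavior.

```python
from typing import Optional


class SignalCooldown:
    """Hypothetical gate that limits how often actionable signals are emitted."""

    def __init__(self, cooldown_seconds: float = 60.0):
        self.cooldown_seconds = cooldown_seconds
        self._last_emit: Optional[float] = None

    def allow(self, action: str, now: float) -> bool:
        """Return True if a signal with `action` may be emitted at time `now` (seconds)."""
        if action == "HOLD":
            return True                       # HOLD never consumes the cooldown
        if self._last_emit is not None and now - self._last_emit < self.cooldown_seconds:
            return False                      # still inside the cooldown window
        self._last_emit = now
        return True


gate = SignalCooldown(cooldown_seconds=60.0)
decisions = [
    gate.allow("LONG", 0.0),    # first actionable signal passes
    gate.allow("SHORT", 30.0),  # 30 s later: suppressed
    gate.allow("HOLD", 31.0),   # HOLD always passes
    gate.allow("LONG", 60.0),   # cooldown elapsed: passes
]
```

Passing the clock value in explicitly (for example from `time.monotonic()`) keeps the gate deterministic and easy to test.
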
---

## LLM Agent Integration

### Integration Flow

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                            LLM AGENT INTEGRATION                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   ┌──────────────────────┐    ┌──────────────────────┐                      │
│   │ ML Engine            │    │ LLM Agent            │                      │
│   │ (FastAPI)            │◄──▶│ (chatgpt-oss)        │                      │
│   │                      │    │                      │                      │
│   │ /api/signal          │    │ Trading Tools:       │                      │
│   │ /api/analysis        │    │ - get_ml_signal()    │                      │
│   │ /api/models/status   │    │ - analyze_market()   │                      │
│   └──────────────────────┘    │ - explain_signal()   │                      │
│                               │ - execute_trade()    │                      │
│                               └──────────┬───────────┘                      │
│                                          │                                  │
│                                          ▼                                  │
│                               ┌──────────────────────┐                      │
│                               │ User Interface       │                      │
│                               │ (Chat / Commands)    │                      │
│                               └──────────────────────┘                      │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

### Tools for the LLM

```python
import requests

# ML_ENGINE_URL and TRADING_SERVICE_URL are expected to be defined by the
# deployment configuration; their values are not specified in this document.


# Tool: get_ml_signal
def get_ml_signal(symbol: str, timeframe: str = '5m') -> dict:
    """
    Get the current ML signal for a symbol.

    Args:
        symbol: Trading pair (XAUUSD, EURUSD, etc.)
        timeframe: Timeframe (5m, 15m, 1h)

    Returns:
        Trading signal with explanation
    """
    response = requests.get(f'{ML_ENGINE_URL}/api/signal', params={
        'symbol': symbol,
        'timeframe': timeframe
    }, timeout=10)
    response.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return response.json()


# Tool: analyze_market
def analyze_market(symbol: str) -> dict:
    """
    Analyze the current state of the market.

    Returns:
        - AMD phase
        - ICT context
        - Key levels
        - Recent signals
    """
    response = requests.get(f'{ML_ENGINE_URL}/api/analysis', params={
        'symbol': symbol
    }, timeout=10)
    response.raise_for_status()
    return response.json()


# Tool: explain_signal
def explain_signal(signal: dict) -> str:
    """
    Generate a natural-language explanation of a signal.

    Returns:
        Detailed explanation for the user
    """
    # The LLM generates the explanation based on signal['reasoning'];
    # this tool is a placeholder until that call is wired in.
    raise NotImplementedError('explanation is delegated to the LLM')


# Tool: execute_trade
def execute_trade(
    symbol: str,
    action: str,
    size: float,
    sl: float,
    tp: float
) -> dict:
    """
    Execute a trade via MetaTrader4.

    Returns:
        Execution confirmation
    """
    response = requests.post(f'{TRADING_SERVICE_URL}/api/trade', json={
        'symbol': symbol,
        'action': action,
        'size': size,
        'stop_loss': sl,
        'take_profit': tp
    }, timeout=10)
    response.raise_for_status()
    return response.json()
```

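The `explain_signal` tool is left to the LLM, but a deterministic fallback can be useful when the model is unavailable or for logging. The sketch below shows one possible fallback that only reformats the fields already present in the signal; the function name `format_signal_explanation` and the output layout are assumptions for illustration.

```python
def format_signal_explanation(signal: dict) -> str:
    """Render a signal's action, confidence and reasoning as plain text (hypothetical fallback)."""
    action = signal.get("action", "HOLD")
    confidence = signal.get("confidence", 0.0)
    lines = [f"{action} ({confidence:.0%} confidence)"]
    # One bullet per reasoning entry, in the order the orchestrator produced them.
    lines.extend(f"- {reason}" for reason in signal.get("reasoning", []))
    return "\n".join(lines)


example_signal = {
    "action": "LONG",
    "confidence": 0.78,
    "reasoning": ["AMD: Accumulation (78%)", "ICT: OTE"],
}
explanation = format_signal_explanation(example_signal)
print(explanation)
```

Because it adds no information beyond `signal['reasoning']`, this fallback cannot contradict the orchestrator, which makes it a safe default before any LLM-generated narrative is layered on top.
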
### System Prompt for the LLM

```
You are OrbiQuant AI Trading Copilot, a specialized assistant for trading operations.

You have access to the following tools:
1. get_ml_signal(symbol, timeframe) - Get ML-generated trading signals
2. analyze_market(symbol) - Get market analysis
3. explain_signal(signal) - Explain a trading signal
4. execute_trade(symbol, action, size, sl, tp) - Execute a trade

When analyzing signals, always explain:
1. The AMD phase and what it means
2. The ICT context (killzone, OTE zone)
3. The probability estimates from the models
4. The risk/reward ratio
5. Your recommendation

Always prioritize risk management. Never recommend trades without proper stop loss.

Current market context:
- AMD phases: Accumulation (building positions), Manipulation (stop hunting),
  Distribution (selling)
- ICT killzones: London Open, NY AM are highest probability
- Minimum confidence threshold: 60%
```

---

**Document generated:** 2025-12-08
**Trading Strategist - OrbiQuant IA**