---
id: "MODELOS-ML-DEFINICION"
title: "Arquitectura de Modelos ML - Trading Platform"
type: "Documentation"
project: "trading-platform"
version: "1.0.0"
updated_date: "2026-01-04"
---

# ML Model Architecture - Trading Platform

**Version:** 1.0.0
**Date:** 2025-12-05
**Module:** OQI-006-ml-signals
**Author:** Trading Strategist - Trading Platform

---

## Table of Contents

1. [Overview](#overview)
2. [Model 1: AMDDetector](#model-1-amddetector)
3. [Model 2: RangePredictor](#model-2-rangepredictor)
4. [Model 3: TPSLClassifier](#model-3-tpslclassifier)
5. [Model 4: LiquidityHunter](#model-4-liquidityhunter)
6. [Model 5: OrderFlowAnalyzer](#model-5-orderflowanalyzer)
7. [Meta-Model: StrategyOrchestrator](#meta-model-strategyorchestrator)
8. [Training Pipeline](#training-pipeline)
9. [Metrics and Evaluation](#metrics-and-evaluation)
10. [Production and Deployment](#production-and-deployment)

---

## Overview

### System Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                     ORBIQUANT IA ML SYSTEM                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐           │
│   │     AMD     │   │  Liquidity  │   │  OrderFlow  │           │
│   │  Detector   │   │   Hunter    │   │  Analyzer   │           │
│   │   (Phase)   │   │   (Hunt)    │   │   (Flow)    │           │
│   └──────┬──────┘   └──────┬──────┘   └──────┬──────┘           │
│          │                 │                 │                  │
│          └─────────────────┼─────────────────┘                  │
│                            │                                    │
│                            ▼                                    │
│                   ┌─────────────────┐                           │
│                   │  Feature Union  │                           │
│                   │   (Combined)    │                           │
│                   └────────┬────────┘                           │
│                            │                                    │
│            ┌───────────────┴───────────────┐                    │
│            │                               │                    │
│            ▼                               ▼                    │
│     ┌─────────────┐                 ┌─────────────┐             │
│     │    Range    │                 │    TPSL     │             │
│     │  Predictor  │                 │ Classifier  │             │
│     │   (ΔH/ΔL)   │                 │   (P[TP])   │             │
│     └──────┬──────┘                 └──────┬──────┘             │
│            │                               │                    │
│            └───────────────┬───────────────┘                    │
│                            │                                    │
│                            ▼                                    │
│                   ┌─────────────────┐                           │
│                   │    Strategy     │                           │
│                   │  Orchestrator   │                           │
│                   │  (Meta-Model)   │                           │
│                   └────────┬────────┘                           │
│                            │                                    │
│                            ▼                                    │
│                   ┌─────────────────┐                           │
│                   │  Signal Output  │                           │
│                   │  BUY/SELL/HOLD  │                           │
│                   └─────────────────┘                           │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

### Data Flow

```
Market Data (OHLCV)
         │
         ▼
Feature Engineering
   (50+ features)
         │
         ├─────────────┬─────────────┬─────────────┐
         ▼             ▼             ▼             ▼
   AMDDetector     Liquidity     OrderFlow       Base
                     Hunter       Analyzer     Features
         │             │             │             │
         └─────────────┴─────────────┴─────────────┘
                       │
                       ▼
            Combined Feature Vector
                 (100+ dims)
                       │
         ┌─────────────┴─────────────┐
         ▼                           ▼
  RangePredictor              TPSLClassifier
         │                           │
         └─────────────┬─────────────┘
                       │
                       ▼
             StrategyOrchestrator
                       │
                       ▼
                Trading Signal
```

### Design Principles

1. **Modular**: each model is independent and reusable
2. **Scalable**: new models are easy to add
3. **Interpretable**: feature importance and explainability
4. **Robust**: strict temporal validation (no look-ahead bias)
5. **Production-ready**: API, monitoring, automatic retraining

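Principle 4 rules out shuffled K-fold splits: every training fold must strictly precede its validation fold in time. A minimal sketch of such a walk-forward split (the helper name is illustrative, not part of the module):

```python
import numpy as np


def walk_forward_splits(n_samples, n_folds=4, min_train=100):
    """Yield (train_idx, val_idx) pairs where train always precedes val in time."""
    fold_size = (n_samples - min_train) // n_folds
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        val_end = min(train_end + fold_size, n_samples)
        yield np.arange(0, train_end), np.arange(train_end, val_end)


splits = list(walk_forward_splits(500, n_folds=4))
# In every fold, all validation indices come after all training indices
assert all(tr.max() < va.min() for tr, va in splits)
```

Each model below is expected to be fitted and evaluated under a split of this kind rather than a random one.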
---

## Model 1: AMDDetector

### Description

A multiclass classifier that identifies the current market phase under the AMD (Accumulation-Manipulation-Distribution) framework.

### Architecture

**Type:** XGBoost multiclass classifier
**Output:** probabilities for the 4 classes

```python
from sklearn.preprocessing import RobustScaler
from xgboost import XGBClassifier


class AMDDetector:
    """Detects AMD phases using XGBoost."""

    def __init__(self, config=None):
        self.config = config or self._default_config()
        self.model = self._init_model()
        self.scaler = RobustScaler()
        self.label_encoder = {
            0: 'neutral',
            1: 'accumulation',
            2: 'manipulation',
            3: 'distribution',
        }

    def _init_model(self):
        return XGBClassifier(
            objective='multi:softprob',
            num_class=4,
            n_estimators=300,
            max_depth=6,
            learning_rate=0.05,
            subsample=0.8,
            colsample_bytree=0.8,
            min_child_weight=5,
            gamma=0.2,
            reg_alpha=0.1,
            reg_lambda=1.0,
            scale_pos_weight=1.0,
            tree_method='hist',
            device='cuda',  # GPU support
            random_state=42,
        )
```

### Input Features

**Dimension:** 50 features

| Category | Features | Count |
|----------|----------|-------|
| **Price Action** | range_ratio, body_size, wicks, etc. | 10 |
| **Volume** | volume_ratio, trend, OBV, etc. | 8 |
| **Volatility** | ATR, volatility_*, percentiles | 6 |
| **Trend** | SMAs, slopes, strength | 8 |
| **Market Structure** | higher_highs, lower_lows, BOS, CHOCH | 10 |
| **Order Flow** | order_blocks, FVG, liquidity_grabs | 8 |

```python
import pandas as pd


def extract_amd_features(df):
    """Extracts features for the AMDDetector."""
    features = {}

    # Price action
    features['range_ratio'] = (df['high'] - df['low']) / df['high'].rolling(20).mean()
    features['body_size'] = abs(df['close'] - df['open']) / (df['high'] - df['low'])
    features['upper_wick'] = (df['high'] - df[['close', 'open']].max(axis=1)) / (df['high'] - df['low'])
    features['lower_wick'] = (df[['close', 'open']].min(axis=1) - df['low']) / (df['high'] - df['low'])
    features['buying_pressure'] = (df['close'] - df['low']) / (df['high'] - df['low'])
    features['selling_pressure'] = (df['high'] - df['close']) / (df['high'] - df['low'])

    # Volume
    features['volume_ratio'] = df['volume'] / df['volume'].rolling(20).mean()
    features['volume_trend'] = df['volume'].rolling(10).mean() - df['volume'].rolling(30).mean()
    features['obv'] = (df['volume'] * ((df['close'] > df['close'].shift(1)).astype(int) * 2 - 1)).cumsum()
    features['obv_slope'] = features['obv'].diff(5) / 5

    # Volatility
    features['atr'] = calculate_atr(df, 14)
    features['atr_ratio'] = features['atr'] / features['atr'].rolling(50).mean()
    features['volatility_10'] = df['close'].pct_change().rolling(10).std()
    features['volatility_20'] = df['close'].pct_change().rolling(20).std()

    # Trend
    features['sma_10'] = df['close'].rolling(10).mean()
    features['sma_20'] = df['close'].rolling(20).mean()
    features['sma_50'] = df['close'].rolling(50).mean()
    features['close_sma_ratio_20'] = df['close'] / features['sma_20']
    features['trend_slope'] = features['sma_20'].diff(5) / 5
    features['trend_strength'] = abs(features['trend_slope']) / features['atr']

    # Market structure
    features['higher_highs'] = (df['high'] > df['high'].shift(1)).rolling(10).sum()
    features['higher_lows'] = (df['low'] > df['low'].shift(1)).rolling(10).sum()
    features['lower_highs'] = (df['high'] < df['high'].shift(1)).rolling(10).sum()
    features['lower_lows'] = (df['low'] < df['low'].shift(1)).rolling(10).sum()

    # Order flow
    features['order_blocks_bullish'] = detect_order_blocks(df, 'bullish')
    features['order_blocks_bearish'] = detect_order_blocks(df, 'bearish')
    features['fvg_count_bullish'] = detect_fvg(df, 'bullish')
    features['fvg_count_bearish'] = detect_fvg(df, 'bearish')

    return pd.DataFrame(features)
```
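The helpers used above (`calculate_atr`, `detect_order_blocks`, `detect_fvg`) live elsewhere in the module. For reference, a minimal pandas sketch of `calculate_atr` under the conventional simple-moving-average definition (the production version may use Wilder smoothing instead):

```python
import pandas as pd


def calculate_atr(df, period=14):
    """Average True Range: rolling mean of the true range."""
    prev_close = df['close'].shift(1)
    true_range = pd.concat([
        df['high'] - df['low'],           # intra-bar range
        (df['high'] - prev_close).abs(),  # gap up vs. previous close
        (df['low'] - prev_close).abs(),   # gap down vs. previous close
    ], axis=1).max(axis=1)
    return true_range.rolling(period).mean()
```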

### Target Labeling

**Method:** forward-looking with a 20-period window

```python
def label_amd_phase(df, i, forward_window=20):
    """Labels the AMD phase based on future behavior."""
    if i + forward_window >= len(df):
        return 0  # neutral

    future = df.iloc[i:i + forward_window]
    current = df.iloc[i]

    # Calculate metrics
    price_range = (future['high'].max() - future['low'].min()) / current['close']
    volume_avg = future['volume'].mean()
    volume_std = future['volume'].std()
    price_end = future['close'].iloc[-1]
    price_start = current['close']

    # Accumulation criteria
    if price_range < 0.02:  # Tight range (< 2%)
        volume_declining = future['volume'].iloc[-5:].mean() < future['volume'].iloc[:5].mean()
        if volume_declining and price_end > price_start:
            return 1  # accumulation

    # Manipulation criteria
    false_breaks = count_false_breakouts(future)
    whipsaws = count_whipsaws(future)
    if false_breaks >= 2 or whipsaws >= 3:
        return 2  # manipulation

    # Distribution criteria
    if price_end < price_start * 0.98:  # Decline >= 2%
        volume_on_down = check_volume_on_down_moves(future)
        lower_highs = count_lower_highs(future)
        if volume_on_down and lower_highs >= 2:
            return 3  # distribution

    return 0  # neutral
```

### Output

```python
from dataclasses import dataclass
from typing import Dict

import pandas as pd


@dataclass
class AMDPrediction:
    phase: str                       # 'neutral', 'accumulation', etc.
    confidence: float                # 0-1
    probabilities: Dict[str, float]  # {'neutral': 0.1, 'accumulation': 0.7, ...}
    strength: float                  # 0-1
    characteristics: Dict            # Phase-specific metrics
    timestamp: pd.Timestamp


# Example
prediction = amd_detector.predict(current_data)
# {
#     'phase': 'accumulation',
#     'confidence': 0.78,
#     'probabilities': {
#         'neutral': 0.05,
#         'accumulation': 0.78,
#         'manipulation': 0.12,
#         'distribution': 0.05
#     },
#     'strength': 0.71,
#     'timestamp': '2025-12-05 14:30:00'
# }
```

### Evaluation Metrics

| Metric | Target | Actual |
|--------|--------|--------|
| **Overall Accuracy** | >70% | - |
| **Accumulation Precision** | >65% | - |
| **Manipulation Precision** | >60% | - |
| **Distribution Precision** | >65% | - |
| **Macro F1 Score** | >0.65 | - |
| **Weighted F1 Score** | >0.70 | - |

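For reference, the macro and weighted F1 rows differ only in how the per-class scores are averaged: macro weights the 4 classes equally, weighted uses their support. With sklearn (dummy labels standing in for real model output):

```python
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 1, 1, 2, 3, 3, 0, 1]
y_pred = [0, 1, 2, 2, 3, 0, 0, 1]

acc = accuracy_score(y_true, y_pred)
macro_f1 = f1_score(y_true, y_pred, average='macro')        # classes weighted equally
weighted_f1 = f1_score(y_true, y_pred, average='weighted')  # classes weighted by support
```

With an imbalanced phase distribution (most bars are `neutral`), the two can diverge substantially, which is why the table tracks both.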
---

## Model 2: RangePredictor

### Description

A regression model that predicts delta_high and delta_low across multiple time horizons.

**See existing implementation:** `[LEGACY: apps/ml-engine - migrado desde TradingAgent]/src/models/range_predictor.py`

### Architecture

**Type:** XGBoost regressor + classifier (for bins)
**Horizons:** 15m (3 bars), 1h (12 bars), custom

```python
from xgboost import XGBClassifier, XGBRegressor


class RangePredictor:
    """Predicts future price ranges."""

    def __init__(self, config=None):
        self.config = config or self._default_config()
        self.horizons = ['15m', '1h']
        self.models = {}

        # Initialize models for each horizon
        for horizon in self.horizons:
            self.models[f'{horizon}_high_reg'] = XGBRegressor(**self.config['xgboost'])
            self.models[f'{horizon}_low_reg'] = XGBRegressor(**self.config['xgboost'])
            self.models[f'{horizon}_high_bin'] = XGBClassifier(**self.config['xgboost_classifier'])
            self.models[f'{horizon}_low_bin'] = XGBClassifier(**self.config['xgboost_classifier'])
```
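`_default_config` is referenced but not shown here. A plausible shape, matching the two keys the constructor reads (the function name and hyperparameter values are illustrative assumptions, not the trained configuration):

```python
def default_range_config():
    """Illustrative config with the two keys RangePredictor expects."""
    return {
        'xgboost': {              # regression heads (delta_high / delta_low)
            'objective': 'reg:squarederror',
            'n_estimators': 300,
            'max_depth': 6,
            'learning_rate': 0.05,
            'random_state': 42,
        },
        'xgboost_classifier': {   # bin heads (volatility classes 0-3)
            'objective': 'multi:softprob',
            'num_class': 4,
            'n_estimators': 300,
            'max_depth': 6,
            'learning_rate': 0.05,
            'random_state': 42,
        },
    }
```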

### Input Features

**Dimension:** 70+ features (base + AMD)

```python
def prepare_range_features(df, amd_features):
    """Combines base features with AMDDetector outputs."""
    # Base technical features (the 21 existing ones)
    base_features = extract_technical_features(df)

    # AMD features (from the AMDDetector)
    amd_enhanced = {
        'phase_encoded': encode_phase(amd_features['phase']),
        'phase_accumulation_prob': amd_features['probabilities']['accumulation'],
        'phase_manipulation_prob': amd_features['probabilities']['manipulation'],
        'phase_distribution_prob': amd_features['probabilities']['distribution'],
        'phase_strength': amd_features['strength'],
        'range_compression': amd_features['characteristics'].get('range_compression', 0),
        'order_blocks_net': (
            amd_features['characteristics'].get('order_blocks_bullish', 0) -
            amd_features['characteristics'].get('order_blocks_bearish', 0)
        )
    }

    # Liquidity features (from the LiquidityHunter)
    liquidity_features = {
        'bsl_distance': calculate_bsl_distance(df),
        'ssl_distance': calculate_ssl_distance(df),
        'liquidity_grab_recent': count_recent_liquidity_grabs(df),
        'fvg_count': count_unfilled_fvg(df)
    }

    # ICT features
    ict_features = {
        'ote_position': calculate_ote_position(df),
        'in_premium_zone': 1 if is_premium_zone(df) else 0,
        'in_discount_zone': 1 if is_discount_zone(df) else 0,
        'killzone_strength': get_killzone_strength(df),
        'weekly_range_position': calculate_weekly_position(df),
        'daily_range_position': calculate_daily_position(df)
    }

    # SMC features
    smc_features = {
        'choch_bullish_count': count_choch(df, 'bullish'),
        'choch_bearish_count': count_choch(df, 'bearish'),
        'bos_bullish_count': count_bos(df, 'bullish'),
        'bos_bearish_count': count_bos(df, 'bearish'),
        'displacement_strength': calculate_displacement(df),
        'market_structure_score': calculate_structure_score(df)
    }

    # Combine all
    return pd.DataFrame({
        **base_features,
        **amd_enhanced,
        **liquidity_features,
        **ict_features,
        **smc_features
    })
```

### Targets

```python
def calculate_range_targets(df, horizons={'15m': 3, '1h': 12}):
    """Computes range targets for training."""
    targets = {}

    # Normalize ATR to a fraction of price so it is comparable with the deltas
    atr_pct = calculate_atr(df, 14) / df['close']

    for name, periods in horizons.items():
        # Delta high/low over the next `periods` bars (forward-looking labels)
        targets[f'delta_high_{name}'] = (
            df['high'].rolling(periods).max().shift(-periods) - df['close']
        ) / df['close']

        targets[f'delta_low_{name}'] = (
            df['close'] - df['low'].rolling(periods).min().shift(-periods)
        ) / df['close']

        # Bins (volatility classification) relative to ATR
        def to_bin(ratio):
            if pd.isna(ratio):
                return np.nan
            if ratio < 0.3:
                return 0  # Very low
            elif ratio < 0.7:
                return 1  # Low
            elif ratio < 1.2:
                return 2  # Medium
            return 3      # High

        targets[f'bin_high_{name}'] = (targets[f'delta_high_{name}'] / atr_pct).apply(to_bin)
        targets[f'bin_low_{name}'] = (targets[f'delta_low_{name}'] / atr_pct).apply(to_bin)

    return pd.DataFrame(targets)
```
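A tiny worked example of the `delta_high` target: `rolling(periods).max().shift(-periods)` at bar i is the maximum high over the *next* `periods` bars, so the label looks forward — legitimate for targets, never for features:

```python
import pandas as pd

df = pd.DataFrame({
    'close': [100.0, 101.0, 102.0, 103.0, 104.0],
    'high':  [100.5, 101.5, 103.0, 103.5, 104.5],
})
periods = 2

# Max high over the next 2 bars, relative to the current close
delta_high = (df['high'].rolling(periods).max().shift(-periods) - df['close']) / df['close']

# Bar 0: max(high[1], high[2]) = 103.0 -> (103.0 - 100.0) / 100.0 = 0.03
# The last `periods` bars have no complete future window, so they stay NaN
```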

### Output

```python
@dataclass
class RangePrediction:
    horizon: str
    delta_high: float            # Predicted max price increase
    delta_low: float             # Predicted max price decrease
    delta_high_bin: int          # Volatility classification
    delta_low_bin: int
    confidence_high: float
    confidence_low: float
    predicted_high_price: float  # Absolute price
    predicted_low_price: float
    timestamp: pd.Timestamp


# Example
predictions = range_predictor.predict(features, current_price=89350)
# [
#     RangePrediction(
#         horizon='15m',
#         delta_high=0.0085,   # +0.85%
#         delta_low=0.0042,    # -0.42%
#         predicted_high_price=90,109,
#         predicted_low_price=88,975,
#         confidence_high=0.72,
#         confidence_low=0.68
#     ),
#     RangePrediction(horizon='1h', ...)
# ]
```

### Evaluation Metrics

| Horizon | MAE High | MAE Low | MAPE | Bin Accuracy | R² |
|---------|----------|---------|------|--------------|-----|
| **15m** | <0.003 | <0.003 | <0.5% | >65% | >0.3 |
| **1h** | <0.005 | <0.005 | <0.8% | >60% | >0.2 |

**Directional Accuracy:**
- High predictions: target >95%
- Low predictions: target >50% (improve from 4-19%)

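The document does not pin down how directional accuracy is measured; one plausible definition (an assumption here) is the fraction of bars where predicted and realized deltas agree in sign:

```python
import numpy as np


def directional_accuracy(pred, actual):
    """Share of (pred, actual) pairs with matching sign, ignoring NaNs."""
    pred, actual = np.asarray(pred, float), np.asarray(actual, float)
    mask = ~(np.isnan(pred) | np.isnan(actual))
    return float(np.mean(np.sign(pred[mask]) == np.sign(actual[mask])))


directional_accuracy([0.01, -0.02, 0.005], [0.02, -0.01, -0.003])  # 2 of 3 agree
```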
---

## Model 3: TPSLClassifier

### Description

A binary classifier that predicts the probability that Take Profit is reached before Stop Loss.

**See existing implementation:** `[LEGACY: apps/ml-engine - migrado desde TradingAgent]/src/models/tp_sl_classifier.py`

### Architecture

**Type:** XGBoost binary classifier with probability calibration
**R:R configs:** multiple ratios (2:1, 3:1, custom)

```python
class TPSLClassifier:
    """Predicts the probability of TP before SL."""

    def __init__(self, config=None):
        self.config = config or self._default_config()
        self.horizons = ['15m', '1h']
        self.rr_configs = [
            {'name': 'rr_2_1', 'sl_atr_multiple': 0.3, 'tp_atr_multiple': 0.6},
            {'name': 'rr_3_1', 'sl_atr_multiple': 0.3, 'tp_atr_multiple': 0.9},
        ]
        self.models = {}
        self.calibrated_models = {}

        # Initialize models
        for horizon in self.horizons:
            for rr in self.rr_configs:
                key = f'{horizon}_{rr["name"]}'
                self.models[key] = XGBClassifier(**self.config['xgboost'])
```

### Input Features

**Dimension:** 80+ features (base + AMD + range predictions)

```python
def prepare_tpsl_features(df, amd_features, range_predictions):
    """TPSLClassifier features, including stacked upstream predictions."""
    # Base + AMD features (same as RangePredictor)
    base_features = prepare_range_features(df, amd_features)

    # Range predictions as features (stacking)
    range_stacking = {
        'pred_delta_high_15m': range_predictions['15m'].delta_high,
        'pred_delta_low_15m': range_predictions['15m'].delta_low,
        'pred_delta_high_1h': range_predictions['1h'].delta_high,
        'pred_delta_low_1h': range_predictions['1h'].delta_low,
        'pred_high_confidence': range_predictions['15m'].confidence_high,
        'pred_low_confidence': range_predictions['15m'].confidence_low,
        'pred_high_low_ratio': (
            range_predictions['15m'].delta_high /
            (range_predictions['15m'].delta_low + 1e-8)
        )
    }

    # R:R-specific features
    rr_features = {
        'atr_current': calculate_atr(df, 14).iloc[-1],
        'volatility_regime': classify_volatility_regime(df),
        'trend_alignment': check_trend_alignment(df, amd_features),
        'liquidity_risk': calculate_liquidity_risk(df),
        'manipulation_risk': amd_features['probabilities']['manipulation']
    }

    return pd.DataFrame({
        **base_features,
        **range_stacking,
        **rr_features
    })
```

### Targets

```python
def calculate_tpsl_targets(df, horizons, rr_configs):
    """Computes whether TP is hit before SL."""
    targets = {}
    atr = calculate_atr(df, 14)

    for horizon_name, periods in horizons.items():
        for rr in rr_configs:
            sl_distance = atr * rr['sl_atr_multiple']
            tp_distance = atr * rr['tp_atr_multiple']

            target_name = f'tp_first_{horizon_name}_{rr["name"]}'

            def check_tp_first(i):
                if i + periods >= len(df):
                    return np.nan

                entry = df['close'].iloc[i]
                sl_price = entry - sl_distance.iloc[i]
                tp_price = entry + tp_distance.iloc[i]

                future = df.iloc[i + 1:i + periods + 1]

                # Check which level is hit first; if both fall inside the same
                # bar, SL is assumed to hit first (conservative labeling)
                for _, row in future.iterrows():
                    if row['low'] <= sl_price:
                        return 0  # SL hit first
                    elif row['high'] >= tp_price:
                        return 1  # TP hit first

                return np.nan  # Neither hit

            targets[target_name] = [check_tp_first(i) for i in range(len(df))]

    return pd.DataFrame(targets)
```

### Probability Calibration

```python
from sklearn.calibration import CalibratedClassifierCV


def calibrate_model(model, X_val, y_val):
    """Calibrates probabilities using isotonic regression."""
    calibrated = CalibratedClassifierCV(
        model,
        method='isotonic',  # or 'sigmoid'
        cv='prefit'
    )
    calibrated.fit(X_val, y_val)
    return calibrated


# Usage
tpsl_classifier.models['15m_rr_2_1'].fit(X_train, y_train)
tpsl_classifier.calibrated_models['15m_rr_2_1'] = calibrate_model(
    tpsl_classifier.models['15m_rr_2_1'],
    X_val, y_val
)
```
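A quick sanity check that calibration actually helped: compare Brier scores on held-out data (lower is better). The probabilities below are dummies standing in for model output:

```python
from sklearn.metrics import brier_score_loss

y_val = [1, 0, 1, 1, 0]
p_raw = [0.9, 0.4, 0.6, 0.8, 0.3]  # uncalibrated P(TP first)
p_cal = [0.8, 0.2, 0.7, 0.9, 0.1]  # after isotonic calibration

raw_brier = brier_score_loss(y_val, p_raw)  # mean squared error of the probabilities
cal_brier = brier_score_loss(y_val, p_cal)
```

Calibrated probabilities matter here because the orchestrator compares them against thresholds like `min_tp_probability` and uses them in expected-value arithmetic.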

### Output

```python
@dataclass
class TPSLPrediction:
    horizon: str
    rr_config: str
    prob_tp_first: float     # P(TP before SL)
    prob_sl_first: float     # 1 - prob_tp_first
    recommended_action: str  # 'long', 'short', 'hold'
    confidence: float        # |prob - 0.5| * 2
    entry_price: float
    sl_price: float
    tp_price: float
    expected_value: float    # EV calculation
    timestamp: pd.Timestamp


# Example
predictions = tpsl_classifier.predict(
    features,
    current_price=89350,
    direction='long'
)
# [
#     TPSLPrediction(
#         horizon='15m',
#         rr_config='rr_2_1',
#         prob_tp_first=0.68,
#         recommended_action='long',
#         confidence=0.36,
#         entry_price=89350,
#         sl_price=89082,       # -0.3 ATR
#         tp_price=89886,       # +0.6 ATR
#         expected_value=0.136  # +13.6% EV
#     )
# ]
```
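The `expected_value` field is not derived in this document. A common formulation in risk units (R) is sketched below; the helper name and normalization are assumptions (the 0.136 in the example above evidently uses a different normalization):

```python
def expected_value(prob_tp_first, tp_atr_multiple, sl_atr_multiple):
    """EV in R: reward/risk ratio weighted by the calibrated hit probabilities."""
    rr = tp_atr_multiple / sl_atr_multiple          # e.g. 0.6 / 0.3 -> 2:1
    return prob_tp_first * rr - (1.0 - prob_tp_first)


expected_value(0.68, 0.6, 0.3)  # 0.68 * 2 - 0.32 = 1.04 R per unit risked
```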

### Evaluation Metrics

| Metric | Target | Actual (Phase 2) |
|--------|--------|------------------|
| **Accuracy** | >80% | 85.9% |
| **Precision** | >75% | 82.1% |
| **Recall** | >75% | 85.7% |
| **F1 Score** | >0.75 | 0.84 |
| **ROC-AUC** | >0.85 | 0.94 |

---

## Model 4: LiquidityHunter

### Description

A model specialized in detecting liquidity zones and predicting "stop hunting" moves.

### Architecture

**Type:** XGBoost binary classifier
**Output:** probability of a liquidity sweep

```python
from sklearn.preprocessing import StandardScaler
from xgboost import XGBClassifier


class LiquidityHunter:
    """Detects and predicts stop hunts."""

    def __init__(self, config=None):
        self.config = config or self._default_config()
        self.model_bsl = XGBClassifier(**self.config['xgboost'])  # Buy-side liquidity
        self.model_ssl = XGBClassifier(**self.config['xgboost'])  # Sell-side liquidity
        self.scaler = StandardScaler()

    def _default_config(self):
        return {
            'lookback_swing': 20,      # Periods for swing points
            'sweep_threshold': 0.005,  # 0.5% beyond the level
            'xgboost': {
                'n_estimators': 200,
                'max_depth': 5,
                'learning_rate': 0.05,
                'scale_pos_weight': 2.0,  # Liquidity sweeps are rare
                'objective': 'binary:logistic',
                'eval_metric': 'auc'
            }
        }
```

### Input Features

**Dimension:** 30 specialized features

```python
def extract_liquidity_features(df, lookback=20):
    """Features for liquidity detection."""
    features = {}

    # Identify liquidity pools
    # NOTE: center=True looks at future bars; fine for offline analysis and
    # labeling, but live features need a causal (right-aligned) window.
    swing_highs = df['high'].rolling(lookback, center=True).max()
    swing_lows = df['low'].rolling(lookback, center=True).min()

    # Distance to liquidity
    features['bsl_distance'] = (swing_highs - df['close']) / df['close']
    features['ssl_distance'] = (df['close'] - swing_lows) / df['close']

    # Liquidity density (how many levels nearby)
    features['bsl_density'] = count_levels_above(df, lookback)
    features['ssl_density'] = count_levels_below(df, lookback)

    # Recent sweep history
    features['bsl_sweeps_recent'] = count_bsl_sweeps(df, window=50)
    features['ssl_sweeps_recent'] = count_ssl_sweeps(df, window=50)

    # Volume profile near liquidity
    features['volume_at_bsl'] = calculate_volume_at_level(df, swing_highs)
    features['volume_at_ssl'] = calculate_volume_at_level(df, swing_lows)

    # Market structure
    features['higher_highs_forming'] = (df['high'] > df['high'].shift(1)).rolling(10).sum()
    features['lower_lows_forming'] = (df['low'] < df['low'].shift(1)).rolling(10).sum()

    # Volatility expansion (often precedes sweeps)
    atr = calculate_atr(df, 14)
    features['atr_expanding'] = (atr > atr.shift(5)).astype(int)
    features['volatility_regime'] = classify_volatility(df)

    # Price proximity to levels
    features['near_bsl'] = (features['bsl_distance'] < 0.01).astype(int)  # Within 1%
    features['near_ssl'] = (features['ssl_distance'] < 0.01).astype(int)

    # Time since last sweep
    features['bars_since_bsl_sweep'] = calculate_bars_since_sweep(df, 'bsl')
    features['bars_since_ssl_sweep'] = calculate_bars_since_sweep(df, 'ssl')

    # Manipulation signals
    features['false_breakouts_recent'] = count_false_breakouts(df, window=30)
    features['whipsaw_intensity'] = calculate_whipsaw_intensity(df)

    # AMD phase context
    features['in_manipulation_phase'] = check_manipulation_phase(df)

    return pd.DataFrame(features)
```

### Targets

```python
def label_liquidity_sweep(df, i, forward_window=10, sweep_threshold=0.005):
    """Labels whether a liquidity sweep will occur."""
    if i + forward_window >= len(df):
        return np.nan

    current_high = df['high'].iloc[max(0, i - 20):i].max()
    current_low = df['low'].iloc[max(0, i - 20):i].min()

    future = df.iloc[i:i + forward_window]

    # BSL sweep (sweep of highs)
    bsl_sweep_price = current_high * (1 + sweep_threshold)
    bsl_swept = (future['high'] >= bsl_sweep_price).any()

    # SSL sweep (sweep of lows)
    ssl_sweep_price = current_low * (1 - sweep_threshold)
    ssl_swept = (future['low'] <= ssl_sweep_price).any()

    # Return binary targets
    return {
        'bsl_sweep': 1 if bsl_swept else 0,
        'ssl_sweep': 1 if ssl_swept else 0,
        'any_sweep': 1 if (bsl_swept or ssl_swept) else 0
    }
```

### Output

```python
@dataclass
class LiquidityPrediction:
    liquidity_type: str       # 'BSL' or 'SSL'
    sweep_probability: float  # 0-1
    liquidity_level: float    # Price level
    distance_pct: float       # Distance to level
    density: int              # Number of levels nearby
    expected_timing: int      # Bars until sweep
    risk_score: float         # Higher = more likely to be trapped
    timestamp: pd.Timestamp


# Example
prediction = liquidity_hunter.predict(current_data)
# [
#     LiquidityPrediction(
#         liquidity_type='BSL',
#         sweep_probability=0.72,
#         liquidity_level=89450,
#         distance_pct=0.0011,  # 0.11% away
#         density=3,
#         expected_timing=5,    # ~5 bars
#         risk_score=0.68       # High risk of reversal after sweep
#     )
# ]
```

### Metrics

| Metric | Target |
|--------|--------|
| **Precision** | >70% |
| **Recall** | >60% |
| **ROC-AUC** | >0.75 |
| **False Positive Rate** | <30% |

---

## Model 5: OrderFlowAnalyzer

### Description

Analyzes order flow to detect institutional accumulation/distribution.

**Note:** optional model - requires granular volume data

### Architecture

**Type:** LSTM / Transformer (for temporal sequences)
**Output:** accumulation/distribution score

```python
import torch
import torch.nn as nn


class OrderFlowAnalyzer(nn.Module):
    """Analyzes order flow using an LSTM."""

    def __init__(self, input_dim=10, hidden_dim=64, num_layers=2):
        super().__init__()
        self.lstm = nn.LSTM(
            input_dim,
            hidden_dim,
            num_layers,
            batch_first=True,
            dropout=0.2
        )
        self.fc = nn.Sequential(
            nn.Linear(hidden_dim, 32),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(32, 3)  # accumulation, neutral, distribution
        )

    def forward(self, x):
        # x shape: (batch, sequence, features)
        lstm_out, _ = self.lstm(x)
        # Take the last output
        last_out = lstm_out[:, -1, :]
        output = self.fc(last_out)
        return torch.softmax(output, dim=1)
```
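As a standalone shape check, the network can be restated and fed a random batch of 4 sequences × 50 timesteps × 10 features, which should yield one 3-class probability distribution per sequence (the class body mirrors the one above so the snippet runs on its own):

```python
import torch
import torch.nn as nn


class _FlowNet(nn.Module):
    """Same layers as OrderFlowAnalyzer, restated for a self-contained check."""

    def __init__(self, input_dim=10, hidden_dim=64, num_layers=2):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers,
                            batch_first=True, dropout=0.2)
        self.fc = nn.Sequential(
            nn.Linear(hidden_dim, 32), nn.ReLU(),
            nn.Dropout(0.3), nn.Linear(32, 3))

    def forward(self, x):
        lstm_out, _ = self.lstm(x)
        return torch.softmax(self.fc(lstm_out[:, -1, :]), dim=1)


model = _FlowNet()
model.eval()  # disable dropout for a deterministic forward pass
with torch.no_grad():
    probs = model(torch.randn(4, 50, 10))  # (batch, sequence, features)

assert probs.shape == (4, 3)  # one distribution per sequence
assert torch.allclose(probs.sum(dim=1), torch.ones(4), atol=1e-5)
```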

### Input Features (Sequence)

**Dimension:** 10 features × 50 timesteps

```python
def extract_order_flow_sequence(df, sequence_length=50):
    """Extracts a sequence of order-flow features."""
    features = []

    for i in range(len(df) - sequence_length + 1):
        window = df.iloc[i:i + sequence_length]

        # Intermediate series (computed first so later features can reuse them;
        # the original dict referenced its own keys as if they were df columns)
        buy_volume = window['volume'] * (window['close'] > window['open']).astype(int)
        sell_volume = window['volume'] * (window['close'] < window['open']).astype(int)
        upticks = count_upticks(window)
        downticks = count_downticks(window)
        cumulative_delta = (buy_volume - sell_volume).cumsum()

        sequence_features = {
            # Volume delta
            'volume_delta': window['volume'] - window['volume'].shift(1),

            # Buy/sell imbalance
            'buy_volume': buy_volume,
            'sell_volume': sell_volume,
            'imbalance': (buy_volume - sell_volume) / window['volume'],

            # Large-order detection
            'large_orders': (window['volume'] > window['volume'].rolling(20).mean() * 2).astype(int),

            # Tick data (if available)
            'upticks': upticks,
            'downticks': downticks,
            'tick_imbalance': (upticks - downticks) / (upticks + downticks + 1),

            # Cumulative metrics
            'cumulative_delta': cumulative_delta,
            'cvd_slope': cumulative_delta.diff(5) / 5
        }

        features.append(pd.DataFrame(sequence_features))

    return np.array([f.values for f in features])
```

### Output

```python
@dataclass
class OrderFlowPrediction:
    flow_type: str                 # 'accumulation', 'distribution', 'neutral'
    confidence: float
    imbalance_score: float         # -1 (selling) to +1 (buying)
    institutional_activity: float  # 0-1
    large_orders_detected: int
    cvd_trend: str                 # 'up', 'down', 'flat'
    timestamp: pd.Timestamp
```

---

## Meta-Model: StrategyOrchestrator

### Description

Combines all of the previous models to generate the final trading signal.

### Architecture

**Type:** weighted ensemble + rule-based

```python
class StrategyOrchestrator:
    """Meta-model that orchestrates all predictions."""

    def __init__(self, models, config=None):
        self.amd_detector = models['amd_detector']
        self.range_predictor = models['range_predictor']
        self.tpsl_classifier = models['tpsl_classifier']
        self.liquidity_hunter = models['liquidity_hunter']
        self.order_flow_analyzer = models.get('order_flow_analyzer')

        self.config = config or self._default_config()
        self.weights = self.config['weights']

    def _default_config(self):
        return {
            'weights': {
                'amd': 0.30,
                'range': 0.25,
                'tpsl': 0.25,
                'liquidity': 0.15,
                'order_flow': 0.05
            },
            'min_confidence': 0.60,
            'min_tp_probability': 0.55,
            'risk_multiplier': 0.02  # 2% risk per trade
        }

    def generate_signal(self, market_data, current_price):
        """Generates a trading signal by combining all models."""
        signal = {
            'action': 'hold',
            'confidence': 0.0,
            'entry_price': current_price,
            'stop_loss': None,
            'take_profit': None,
            'position_size': 0.0,
            'reasoning': [],
            'model_outputs': {}
        }

        # 1. AMD phase
        amd_pred = self.amd_detector.predict(market_data)
        signal['model_outputs']['amd'] = amd_pred

        if amd_pred['confidence'] < 0.6:
            signal['reasoning'].append('Low AMD confidence - avoiding trade')
            return signal

        # 2. Range prediction
        range_pred = self.range_predictor.predict(market_data, current_price)
        signal['model_outputs']['range'] = range_pred

        # 3. TPSL probability
        tpsl_pred = self.tpsl_classifier.predict(market_data, current_price)
        signal['model_outputs']['tpsl'] = tpsl_pred

        # 4. Liquidity analysis
        liq_pred = self.liquidity_hunter.predict(market_data)
        signal['model_outputs']['liquidity'] = liq_pred

        # 5. Order flow (if available)
        if self.order_flow_analyzer:
            flow_pred = self.order_flow_analyzer.predict(market_data)
            signal['model_outputs']['order_flow'] = flow_pred

        # === DECISION LOGIC ===

        # Determine bias from AMD
        if amd_pred['phase'] == 'accumulation':
            bias = 'bullish'
            signal['reasoning'].append(f'AMD: Accumulation phase (conf: {amd_pred["confidence"]:.2%})')
        elif amd_pred['phase'] == 'distribution':
            bias = 'bearish'
            signal['reasoning'].append(f'AMD: Distribution phase (conf: {amd_pred["confidence"]:.2%})')
        elif amd_pred['phase'] == 'manipulation':
            signal['reasoning'].append('AMD: Manipulation phase - avoiding entry')
            return signal
        else:
            signal['reasoning'].append('AMD: Neutral phase - no clear direction')
            return signal

        # Check range prediction alignment
        if bias == 'bullish':
            range_alignment = range_pred['15m'].delta_high > range_pred['15m'].delta_low * 1.5
        else:
            range_alignment = range_pred['15m'].delta_low > range_pred['15m'].delta_high * 1.5

        if not range_alignment:
            signal['reasoning'].append('Range prediction does not align with bias')
            return signal

        signal['reasoning'].append('Range prediction aligned')

        # Check TPSL probability; map the bias to the 'long'/'short' naming
        # used by TPSLPrediction.recommended_action
        direction = 'long' if bias == 'bullish' else 'short'
        relevant_tpsl = [p for p in tpsl_pred if p.recommended_action == direction]
        if not relevant_tpsl:
            signal['reasoning'].append('No TPSL prediction for this direction')
            return signal
        if relevant_tpsl[0].prob_tp_first < self.config['min_tp_probability']:
            signal['reasoning'].append(f'Low TP probability: {relevant_tpsl[0].prob_tp_first:.2%}')
            return signal

        signal['reasoning'].append(f'High TP probability: {relevant_tpsl[0].prob_tp_first:.2%}')

        # Check liquidity risk
        if liq_pred:
|
liquidity_risk = any(p.sweep_probability > 0.7 and p.distance_pct < 0.005 for p in liq_pred)
|
|
if liquidity_risk:
|
|
signal['reasoning'].append('High liquidity sweep risk nearby')
|
|
# Reduce position size
|
|
position_multiplier = 0.5
|
|
else:
|
|
position_multiplier = 1.0
|
|
else:
|
|
position_multiplier = 1.0
|
|
|
|
# === CALCULATE CONFIDENCE ===
|
|
confidence_score = 0.0
|
|
|
|
# AMD contribution
|
|
confidence_score += self.weights['amd'] * amd_pred['confidence']
|
|
|
|
# Range contribution
|
|
range_conf = (range_pred['15m'].confidence_high + range_pred['15m'].confidence_low) / 2
|
|
confidence_score += self.weights['range'] * range_conf
|
|
|
|
# TPSL contribution
|
|
tpsl_conf = relevant_tpsl[0].confidence
|
|
confidence_score += self.weights['tpsl'] * tpsl_conf
|
|
|
|
# Liquidity contribution
|
|
if liq_pred:
|
|
liq_conf = 1 - max(p.risk_score for p in liq_pred) # Inverse of risk
|
|
confidence_score += self.weights['liquidity'] * liq_conf
|
|
|
|
signal['confidence'] = confidence_score
|
|
|
|
if confidence_score < self.config['min_confidence']:
|
|
signal['reasoning'].append(f'Overall confidence too low: {confidence_score:.2%}')
|
|
return signal
|
|
|
|
# === GENERATE ENTRY ===
|
|
signal['action'] = 'long' if bias == 'bullish' else 'short'
|
|
signal['entry_price'] = current_price
|
|
|
|
# Use TPSL predictions
|
|
tpsl_entry = relevant_tpsl[0]
|
|
signal['stop_loss'] = tpsl_entry.sl_price
|
|
signal['take_profit'] = tpsl_entry.tp_price
|
|
|
|
# Calculate position size
|
|
account_risk = self.config['risk_multiplier'] # 2% of account
|
|
price_risk = abs(current_price - tpsl_entry.sl_price) / current_price
|
|
signal['position_size'] = (account_risk / price_risk) * position_multiplier
|
|
|
|
signal['reasoning'].append(f'Signal generated: {signal["action"].upper()}')
|
|
signal['reasoning'].append(f'Confidence: {confidence_score:.2%}')
|
|
signal['reasoning'].append(f'R:R: {(abs(tpsl_entry.tp_price - current_price) / abs(current_price - tpsl_entry.sl_price)):.2f}:1')
|
|
|
|
return signal
|
|
```
|
|
|
|
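The position-sizing step in `generate_signal` is a small, self-contained formula: size the position so that hitting the stop loses `risk_multiplier` of the account, then scale down when liquidity risk is elevated. A standalone sketch with hypothetical prices:

```python
def position_size(entry, stop, account_risk=0.02, multiplier=1.0):
    """Account fraction to deploy so a stop-out costs ~account_risk."""
    price_risk = abs(entry - stop) / entry   # % move from entry to the stop
    return (account_risk / price_risk) * multiplier

# Stop 2% away, full liquidity multiplier: risk 2% / move 2% -> 1.0x account
print(position_size(entry=100.0, stop=98.0))                  # → 1.0
# Same stop, liquidity sweep risk nearby -> halved exposure
print(position_size(entry=100.0, stop=98.0, multiplier=0.5))  # → 0.5
```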
### Decision Pipeline

```
Market Data
     │
     ▼
┌─────────────┐
│ AMDDetector │──── Phase = Accumulation? ──── NO → HOLD
└──────┬──────┘     Confidence > 0.60?
       │ YES
       ▼
┌─────────────────┐
│ RangePredictor  │── ΔHigh > ΔLow × 1.5? ──── NO → HOLD
└────────┬────────┘
         │ YES
         ▼
┌─────────────────┐
│ TPSLClassifier  │── P(TP first) > 0.55? ──── NO → HOLD
└────────┬────────┘
         │ YES
         ▼
┌─────────────────┐
│ LiquidityHunter │── Sweep risk low? ──────── NO → reduce size
└────────┬────────┘
         │ YES
         ▼
┌─────────────────┐
│   Confidence    │
│   Calculation   │── Total > 0.60? ────────── NO → HOLD
└────────┬────────┘
         │ YES
         ▼
┌─────────────────┐
│   LONG SIGNAL   │
│  Entry, SL, TP  │
└─────────────────┘
```
### Output

```python
@dataclass
class TradingSignal:
    action: str                 # 'long', 'short', 'hold'
    confidence: float           # 0-1
    entry_price: float
    stop_loss: float
    take_profit: float
    position_size: float        # Units or % of account
    risk_reward_ratio: float
    expected_value: float       # EV calculation
    reasoning: List[str]        # Why this signal
    model_outputs: Dict         # All model predictions
    timestamp: pd.Timestamp

    # Metadata
    symbol: str
    horizon: str
    amd_phase: str
    killzone: str

# Full example
signal = orchestrator.generate_signal(market_data, current_price=89350)
# TradingSignal(
#     action='long',
#     confidence=0.73,
#     entry_price=89350,
#     stop_loss=89082,
#     take_profit=89886,
#     position_size=0.15,          # 15% of account
#     risk_reward_ratio=2.0,
#     expected_value=0.214,        # +21.4% EV
#     reasoning=[
#         'AMD: Accumulation phase (conf: 78%)',
#         'Range prediction aligned',
#         'High TP probability: 68%',
#         'Signal generated: LONG',
#         'Confidence: 73%',
#         'R:R: 2.00:1'
#     ],
#     amd_phase='accumulation',
#     killzone='ny_am'
# )
```
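The `expected_value` field combines win probability and payoff. One common convention expresses EV in risk units (R): `EV = p·R − (1 − p)`. A sketch of that convention — note the example above reports EV under its own normalization, so the numbers are not directly comparable:

```python
def expected_value(prob_tp, risk_reward):
    """Expected value per trade in R units: EV = p*R - (1 - p)."""
    return prob_tp * risk_reward - (1 - prob_tp)

# P(TP first) = 68%, R:R = 2:1  ->  0.68*2 - 0.32 = 1.04R per trade
print(round(expected_value(0.68, 2.0), 2))   # → 1.04
```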
---

## Pipeline de Entrenamiento

### Complete Workflow

```python
import numpy as np


class MLTrainingPipeline:
    """
    End-to-end training pipeline.
    """

    def __init__(self, data_path, config):
        self.data_path = data_path
        self.config = config
        self.models = {}

    def run(self):
        """Runs the full pipeline."""

        # 1. Load & prepare data
        print("1. Loading data...")
        df = self.load_data()

        # 2. Feature engineering
        print("2. Engineering features...")
        features = self.engineer_features(df)

        # 3. Target labeling
        print("3. Labeling targets...")
        targets = self.label_targets(df)

        # 4. Train-test split (temporal)
        print("4. Splitting data...")
        X_train, X_val, X_test, y_train, y_val, y_test = self.temporal_split(
            features, targets
        )

        # 5. Train AMDDetector
        print("5. Training AMDDetector...")
        self.models['amd_detector'] = self.train_amd_detector(
            X_train, y_train['amd'], X_val, y_val['amd']
        )

        # 6. Generate AMD features for the downstream models
        print("6. Generating AMD features...")
        amd_features_train = self.models['amd_detector'].predict_proba(X_train)
        amd_features_val = self.models['amd_detector'].predict_proba(X_val)

        # 7. Train RangePredictor
        print("7. Training RangePredictor...")
        X_range_train = np.hstack([X_train, amd_features_train])
        X_range_val = np.hstack([X_val, amd_features_val])

        self.models['range_predictor'] = self.train_range_predictor(
            X_range_train, y_train['range'], X_range_val, y_val['range']
        )

        # 8. Generate range predictions for TPSL
        print("8. Generating range predictions...")
        range_preds_train = self.models['range_predictor'].predict(X_range_train)
        range_preds_val = self.models['range_predictor'].predict(X_range_val)

        # 9. Train TPSLClassifier
        print("9. Training TPSLClassifier...")
        X_tpsl_train = np.hstack([X_range_train, range_preds_train])
        X_tpsl_val = np.hstack([X_range_val, range_preds_val])

        self.models['tpsl_classifier'] = self.train_tpsl_classifier(
            X_tpsl_train, y_train['tpsl'], X_tpsl_val, y_val['tpsl']
        )

        # 10. Train LiquidityHunter
        print("10. Training LiquidityHunter...")
        self.models['liquidity_hunter'] = self.train_liquidity_hunter(
            X_train, y_train['liquidity'], X_val, y_val['liquidity']
        )

        # 11. Evaluate all models
        print("11. Evaluating models...")
        self.evaluate_all(X_test, y_test)

        # 12. Save models
        print("12. Saving models...")
        self.save_all_models()

        print("Training complete!")
        return self.models

    def temporal_split(self, features, targets, train_pct=0.7, val_pct=0.15):
        """Temporal split (no shuffling)."""
        n = len(features)
        train_end = int(n * train_pct)
        val_end = int(n * (train_pct + val_pct))

        return (
            features[:train_end],
            features[train_end:val_end],
            features[val_end:],
            targets[:train_end],
            targets[train_end:val_end],
            targets[val_end:]
        )
```
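The `temporal_split` boundaries are plain index arithmetic over the row count; a quick standalone check of the default 70/15/15 proportions:

```python
def temporal_split_bounds(n, train_pct=0.7, val_pct=0.15):
    """Indices where the train and validation windows end (no shuffling)."""
    train_end = int(n * train_pct)
    val_end = int(n * (train_pct + val_pct))
    return train_end, val_end

train_end, val_end = temporal_split_bounds(1000)
print(train_end, val_end)   # → 700 850
```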
### Temporal Cross-Validation

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import TimeSeriesSplit


def temporal_cross_validation(model, X, y, n_splits=5):
    """
    Cross-validation that respects temporal ordering.
    """
    tscv = TimeSeriesSplit(n_splits=n_splits)
    scores = []

    for fold, (train_idx, val_idx) in enumerate(tscv.split(X)):
        print(f"Fold {fold + 1}/{n_splits}")

        X_train, X_val = X[train_idx], X[val_idx]
        y_train, y_val = y[train_idx], y[val_idx]

        # Train
        model.fit(X_train, y_train)

        # Evaluate
        y_pred = model.predict(X_val)
        score = accuracy_score(y_val, y_pred)
        scores.append(score)

        print(f"  Accuracy: {score:.4f}")

    print(f"\nMean Accuracy: {np.mean(scores):.4f} ± {np.std(scores):.4f}")
    return scores
```
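`TimeSeriesSplit` grows the training window while keeping every validation fold strictly in the future. The same behaviour can be sketched without scikit-learn (equal-sized folds are an assumption of this sketch, not exactly how `TimeSeriesSplit` allocates leftover samples):

```python
import numpy as np

def expanding_window_splits(n_samples, n_splits=5):
    """Yield (train_idx, val_idx): expanding train window, future-only validation."""
    fold = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train_idx = np.arange(0, k * fold)
        val_idx = np.arange(k * fold, min((k + 1) * fold, n_samples))
        yield train_idx, val_idx

for train_idx, val_idx in expanding_window_splits(60):
    assert train_idx.max() < val_idx.min()   # no look-ahead leakage
```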
---

## Métricas y Evaluación

### Per-Model Metrics

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score, classification_report, confusion_matrix, f1_score,
    mean_absolute_error, mean_squared_error, precision_score, r2_score,
    recall_score, roc_auc_score
)


class ModelEvaluator:
    """
    Full model evaluation suite.
    """

    @staticmethod
    def evaluate_amd_detector(model, X_test, y_test):
        """Evaluate the AMDDetector."""
        y_pred = model.predict(X_test)
        y_pred_proba = model.predict_proba(X_test)

        metrics = {
            'accuracy': accuracy_score(y_test, y_pred),
            'macro_f1': f1_score(y_test, y_pred, average='macro'),
            'weighted_f1': f1_score(y_test, y_pred, average='weighted'),
            'classification_report': classification_report(y_test, y_pred),
            'confusion_matrix': confusion_matrix(y_test, y_pred)
        }

        # Per-class metrics
        for class_idx, class_name in model.label_encoder.items():
            mask = y_test == class_idx
            if mask.sum() > 0:
                metrics[f'{class_name}_precision'] = precision_score(
                    y_test == class_idx, y_pred == class_idx
                )
                metrics[f'{class_name}_recall'] = recall_score(
                    y_test == class_idx, y_pred == class_idx
                )

        return metrics

    @staticmethod
    def evaluate_range_predictor(model, X_test, y_test):
        """Evaluate the RangePredictor."""
        predictions = model.predict(X_test)

        metrics = {}
        for horizon in ['15m', '1h']:
            for target_type in ['high', 'low']:
                y_true = y_test[f'delta_{target_type}_{horizon}']
                y_pred = [p.delta_high if target_type == 'high' else p.delta_low
                          for p in predictions if p.horizon == horizon]

                metrics[f'{horizon}_{target_type}_mae'] = mean_absolute_error(y_true, y_pred)
                metrics[f'{horizon}_{target_type}_rmse'] = np.sqrt(mean_squared_error(y_true, y_pred))
                metrics[f'{horizon}_{target_type}_r2'] = r2_score(y_true, y_pred)

                # Directional accuracy
                direction_true = np.sign(y_true)
                direction_pred = np.sign(y_pred)
                metrics[f'{horizon}_{target_type}_directional_acc'] = (
                    direction_true == direction_pred
                ).mean()

        return metrics

    @staticmethod
    def evaluate_tpsl_classifier(model, X_test, y_test):
        """Evaluate the TPSLClassifier."""
        metrics = {}

        for horizon in ['15m', '1h']:
            for rr in ['rr_2_1', 'rr_3_1']:
                target_key = f'tp_first_{horizon}_{rr}'
                y_true = y_test[target_key].dropna()

                if len(y_true) == 0:
                    continue

                X_valid = X_test[y_test[target_key].notna()]

                y_pred = model.predict_proba(X_valid, horizon, rr)
                y_pred_class = (y_pred > 0.5).astype(int)

                metrics[f'{horizon}_{rr}_accuracy'] = accuracy_score(y_true, y_pred_class)
                metrics[f'{horizon}_{rr}_roc_auc'] = roc_auc_score(y_true, y_pred)
                metrics[f'{horizon}_{rr}_precision'] = precision_score(y_true, y_pred_class)
                metrics[f'{horizon}_{rr}_recall'] = recall_score(y_true, y_pred_class)
                metrics[f'{horizon}_{rr}_f1'] = f1_score(y_true, y_pred_class)

        return metrics
```
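Directional accuracy, used above for the range targets, ignores magnitude and scores only the sign of the predicted move; a tiny illustration with made-up deltas:

```python
import numpy as np

def directional_accuracy(y_true, y_pred):
    """Share of samples where the predicted move has the correct sign."""
    return float((np.sign(y_true) == np.sign(y_pred)).mean())

y_true = np.array([0.4, -0.2, 0.1, -0.5])
y_pred = np.array([0.3,  0.1, 0.2, -0.6])    # one sign miss out of four
print(directional_accuracy(y_true, y_pred))  # → 0.75
```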
### Signal Backtesting

```python
import numpy as np
import pandas as pd


class SignalBacktester:
    """
    Backtests generated signals.
    """

    def __init__(self, initial_capital=10000):
        self.initial_capital = initial_capital
        self.capital = initial_capital
        self.trades = []
        self.equity_curve = []

    def run(self, df, signals):
        """Runs the backtest."""
        position = None

        for i, signal in enumerate(signals):
            if signal['action'] == 'hold':
                continue

            # Entry
            if position is None and signal['action'] in ['long', 'short']:
                position = {
                    'type': signal['action'],
                    'entry_price': signal['entry_price'],
                    'entry_time': signal['timestamp'],
                    'stop_loss': signal['stop_loss'],
                    'take_profit': signal['take_profit'],
                    'size': signal['position_size']
                }

            # Check exit
            if position is not None:
                # Simulate price movement over the next bars
                future_bars = df[df.index > signal['timestamp']].head(100)

                for idx, row in future_bars.iterrows():
                    # Long: check SL, then TP
                    if position['type'] == 'long' and row['low'] <= position['stop_loss']:
                        self._close_position(position, position['stop_loss'], idx, 'SL')
                        position = None
                        break
                    elif position['type'] == 'long' and row['high'] >= position['take_profit']:
                        self._close_position(position, position['take_profit'], idx, 'TP')
                        position = None
                        break
                    # Short: check SL, then TP
                    elif position['type'] == 'short' and row['high'] >= position['stop_loss']:
                        self._close_position(position, position['stop_loss'], idx, 'SL')
                        position = None
                        break
                    elif position['type'] == 'short' and row['low'] <= position['take_profit']:
                        self._close_position(position, position['take_profit'], idx, 'TP')
                        position = None
                        break

            self.equity_curve.append(self.capital)

        return self._calculate_metrics()

    def _close_position(self, position, exit_price, exit_time, exit_reason):
        """Closes the open position."""
        if position['type'] == 'long':
            pnl = (exit_price - position['entry_price']) / position['entry_price']
        else:
            pnl = (position['entry_price'] - exit_price) / position['entry_price']

        pnl_amount = self.capital * position['size'] * pnl
        self.capital += pnl_amount

        self.trades.append({
            'type': position['type'],
            'entry_price': position['entry_price'],
            'exit_price': exit_price,
            'entry_time': position['entry_time'],
            'exit_time': exit_time,
            'exit_reason': exit_reason,
            'pnl_pct': pnl * 100,
            'pnl_amount': pnl_amount
        })

    def _calculate_metrics(self):
        """Computes performance metrics."""
        if not self.trades:
            return {}

        trades_df = pd.DataFrame(self.trades)

        total_return = (self.capital - self.initial_capital) / self.initial_capital
        num_trades = len(trades_df)
        num_wins = (trades_df['pnl_pct'] > 0).sum()
        num_losses = (trades_df['pnl_pct'] < 0).sum()
        win_rate = num_wins / num_trades if num_trades > 0 else 0

        avg_win = trades_df[trades_df['pnl_pct'] > 0]['pnl_pct'].mean() if num_wins > 0 else 0
        avg_loss = trades_df[trades_df['pnl_pct'] < 0]['pnl_pct'].mean() if num_losses > 0 else 0

        # Sharpe ratio (annualized, assuming daily equity points)
        returns = pd.Series(self.equity_curve).pct_change().dropna()
        sharpe = np.sqrt(252) * (returns.mean() / returns.std()) if returns.std() > 0 else 0

        # Max drawdown
        equity_series = pd.Series(self.equity_curve)
        cummax = equity_series.cummax()
        drawdown = (equity_series - cummax) / cummax
        max_drawdown = drawdown.min()

        return {
            'total_return_pct': total_return * 100,
            'final_capital': self.capital,
            'num_trades': num_trades,
            'num_wins': num_wins,
            'num_losses': num_losses,
            'win_rate': win_rate * 100,
            'avg_win_pct': avg_win,
            'avg_loss_pct': avg_loss,
            'profit_factor': abs(avg_win * num_wins / (avg_loss * num_losses)) if num_losses > 0 else np.inf,
            'sharpe_ratio': sharpe,
            'max_drawdown_pct': max_drawdown * 100
        }
```
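The maximum-drawdown computation at the end of `_calculate_metrics` can be verified in isolation with a toy equity curve:

```python
import pandas as pd

equity = pd.Series([100, 110, 105, 120, 90, 95])
cummax = equity.cummax()                 # running peak: 100 110 110 120 120 120
drawdown = (equity - cummax) / cummax    # fractional distance below the peak
print(float(drawdown.min()))             # → -0.25  (peak 120 to trough 90)
```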
---

## Producción y Deployment

### FastAPI Service

```python
from datetime import datetime
from typing import Dict

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="Trading Platform ML Service")

# Load models
orchestrator = StrategyOrchestrator.load('models/orchestrator_v1.pkl')


class PredictionRequest(BaseModel):
    symbol: str
    timeframe: str = '5m'
    include_reasoning: bool = True


class PredictionResponse(BaseModel):
    signal: TradingSignal
    metadata: Dict


@app.post("/api/signal")
async def get_trading_signal(request: PredictionRequest):
    """
    Generates a trading signal.
    """
    try:
        # Fetch market data
        market_data = fetch_market_data(request.symbol, request.timeframe)

        # Generate signal
        signal = orchestrator.generate_signal(
            market_data,
            current_price=market_data['close'].iloc[-1]
        )

        return PredictionResponse(
            signal=signal,
            metadata={
                'model_version': '1.0.0',
                'latency_ms': 45,
                'timestamp': datetime.now().isoformat()
            }
        )

    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))


@app.get("/api/health")
async def health_check():
    return {
        'status': 'healthy',
        'models_loaded': True,
        'version': '1.0.0'
    }
```
### Monitoring

```python
import prometheus_client as prom

# Metrics
prediction_counter = prom.Counter('ml_predictions_total', 'Total predictions')
prediction_latency = prom.Histogram('ml_prediction_latency_seconds', 'Prediction latency')
model_accuracy = prom.Gauge('ml_model_accuracy', 'Model accuracy', ['model_name'])


@prediction_latency.time()
def generate_signal_monitored(data):
    prediction_counter.inc()
    signal = orchestrator.generate_signal(data)
    return signal
```
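Outside Prometheus, the same latency capture can be sketched with a plain decorator; this is an illustrative stand-in for `Histogram.time()`, not a replacement for it:

```python
import time
from functools import wraps

def timed(store):
    """Append each call's wall-clock duration (seconds) to `store`."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                store.append(time.perf_counter() - start)
        return wrapper
    return decorator

latencies = []

@timed(latencies)
def generate_signal_stub(data):   # hypothetical stand-in for the real call
    return {'action': 'hold'}

generate_signal_stub({})
print(len(latencies))   # → 1
```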
### Retraining Pipeline

```python
class AutoRetrainingPipeline:
    """
    Automatic retraining pipeline.
    """

    def __init__(self, config, schedule='weekly'):
        self.config = config
        self.schedule = schedule
        self.performance_threshold = 0.70

    def should_retrain(self):
        """Decides whether retraining is needed."""
        # Check recent performance
        recent_accuracy = self.get_recent_accuracy()

        if recent_accuracy < self.performance_threshold:
            return True, 'Performance degradation'

        # Check data drift
        drift_detected = self.detect_data_drift()
        if drift_detected:
            return True, 'Data drift detected'

        return False, None

    def execute_retraining(self):
        """Runs the retraining."""
        print("Starting retraining...")

        # Fetch new data
        new_data = self.fetch_latest_data()

        # Retrain all models
        pipeline = MLTrainingPipeline(new_data, self.config)
        new_models = pipeline.run()

        # Validate new models
        if self.validate_new_models(new_models):
            # Deploy new models
            self.deploy_models(new_models)
            print("Retraining complete. New models deployed.")
        else:
            print("Validation failed. Keeping old models.")
```
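`detect_data_drift` is left abstract above; one common, lightweight implementation is the Population Stability Index (PSI) over feature histograms. A numpy-only sketch — the 0.2 threshold is a widely used rule of thumb, not something specified by this document:

```python
import numpy as np

def psi(expected, actual, bins=10):
    """Population Stability Index between a reference and a recent sample."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    e_pct = np.clip(e_pct, 1e-6, None)   # avoid log(0) on empty bins
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, 5_000)   # training-time feature distribution
recent = rng.normal(1.0, 1.0, 5_000)      # production sample with a mean shift
print(psi(reference, recent) > 0.2)       # → True (common drift threshold)
```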
---

**Document Generated:** 2025-12-05
**Next Review:** 2025-Q1
**Contact:** ml-engineering@trading.ai