---
id: "ET-ML-012"
title: "VBP (Volatility Breakout Predictor) Strategy"
type: "Technical Specification"
status: "Approved"
priority: "Alta"
epic: "OQI-006"
project: "trading-platform"
version: "1.0.0"
created_date: "2026-01-25"
updated_date: "2026-01-25"
task_reference: "TASK-2026-01-25-ML-TRAINING-ENHANCEMENT"
---

# ET-ML-012: VBP (Volatility Breakout Predictor) Strategy

## Metadata

| Field | Value |
|-------|-------|
| **ID** | ET-ML-012 |
| **Epic** | OQI-006 - ML Signals |
| **Type** | Technical Specification |
| **Version** | 1.0.0 |
| **Status** | Approved |
| **Last updated** | 2026-01-25 |
| **Reference task** | TASK-2026-01-25-ML-TRAINING-ENHANCEMENT |

---

## Summary

The VBP (Volatility Breakout Predictor) strategy predicts imminent volatility breakouts using a **1D CNN** with **Attention**, followed by **XGBoost** classifiers. The model specializes in detecting the volatility compressions that precede explosive price moves.

### Key Features

- **CNN 1D + Attention**: Extracts local patterns and global relationships
- **Volatility Features**: ATR, Bollinger Bands, Keltner Channels
- **Squeeze Detection**: Identifies volatility compressions
- **Balanced Sampling**: 3x oversample of breakout events

---

## Architecture

### High-Level Diagram

```
┌─────────────────────────────────────────────────────────────────────────┐
│                                VBP MODEL                                │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  Input: OHLCV Sequence (seq_len x n_features)                           │
│  └── Volatility features (ATR, BB, Keltner)                             │
│                                    │                                    │
│                                    ▼                                    │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                          CNN 1D BACKBONE                          │  │
│  │                                                                   │  │
│  │        ┌─────────────┐   ┌─────────────┐   ┌─────────────┐        │  │
│  │        │ Conv1D      │   │ Conv1D      │   │ Conv1D      │        │  │
│  │        │ 64 filters  │──▶│ 128 filters │──▶│ 256 filters │        │  │
│  │        │ kernel=7    │   │ kernel=5    │   │ kernel=3    │        │  │
│  │        └─────────────┘   └─────────────┘   └─────────────┘        │  │
│  │               │                 │                 │               │  │
│  │               └─────────────────┼─────────────────┘               │  │
│  │                                 ▼                                 │  │
│  │                      ┌─────────────────────┐                      │  │
│  │                      │ Multi-Scale Concat  │                      │  │
│  │                      └─────────────────────┘                      │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                    │                                    │
│                                    ▼                                    │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                          ATTENTION LAYER                          │  │
│  │                                                                   │  │
│  │  ┌─────────────────┐     ┌─────────────────────────────────────┐  │  │
│  │  │ Self-Attention  │────▶│ Temporal Attention Pooling          │  │  │
│  │  │ (4 heads)       │     │ (weighted avg over sequence)        │  │  │
│  │  └─────────────────┘     └─────────────────────────────────────┘  │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                    │                                    │
│                                    ▼                                    │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                           XGBOOST HEAD                            │  │
│  │                                                                   │  │
│  │           ┌──────────────────┐     ┌──────────────────┐           │  │
│  │           │ Breakout         │     │ Breakout         │           │  │
│  │           │ Classifier       │     │ Direction        │           │  │
│  │           │ (binary)         │     │ Classifier       │           │  │
│  │           └──────────────────┘     └──────────────────┘           │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                    │                                    │
│                                    ▼                                    │
│  Output: VBPPrediction                                                  │
│  - breakout_prob: float (0 to 1)                                        │
│  - breakout_direction: 'up' | 'down' | 'none'                           │
│  - squeeze_intensity: float                                             │
│  - expected_magnitude: float                                            │
└─────────────────────────────────────────────────────────────────────────┘
```

### CNN 1D Configuration

```python
CNNConfig:
    conv_channels: [64, 128, 256]
    kernel_sizes: [7, 5, 3]
    pool_sizes: [2, 2, 2]
    activation: 'gelu'
    batch_norm: True
    dropout: 0.2
```

| Layer | Filters | Kernel | Captures |
|-------|---------|--------|----------|
| Conv1D_1 | 64 | 7 | Local patterns |
| Conv1D_2 | 128 | 5 | Mid-range patterns |
| Conv1D_3 | 256 | 3 | High-level abstractions |

### Attention Configuration

| Parameter | Value |
|-----------|-------|
| **n_heads** | 4 |
| **attention_dim** | 128 |
| **dropout** | 0.1 |
| **pooling** | Temporal weighted average |

---

## Feature Engineering

### Volatility Features

#### 1. ATR (Average True Range)

```python
ATR Features:
    atr_14: ATR(14)
    atr_7: ATR(7)
    atr_28: ATR(28)
    atr_ratio: atr_7 / atr_28
    atr_percentile: rolling percentile of ATR
    normalized_atr: atr / close
```

| Feature | Description | Use |
|---------|-------------|-----|
| `atr_14` | Standard ATR | Baseline volatility |
| `atr_ratio` | Short/long ATR | Expansion/contraction |
| `normalized_atr` | ATR as % of price | Comparison across symbols |

#### 2. Bollinger Bands

```python
BB Features:
    bb_upper: SMA(20) + 2*std
    bb_lower: SMA(20) - 2*std
    bb_width: (upper - lower) / middle
    bb_width_percentile: percentile(bb_width, 100)
    bb_squeeze: bb_width < percentile_20
    bb_expansion: bb_width > percentile_80
    bb_position: (close - lower) / (upper - lower)
```

| Feature | Formula | Interpretation |
|---------|---------|----------------|
| `bb_width` | `(upper - lower) / middle` | Relative width |
| `bb_squeeze` | `bb_width < percentile(20)` | Volatility compression |
| `bb_position` | Position within the bands (0 = lower, 1 = upper) | Overbought/oversold |

#### 3. Keltner Channels

```python
Keltner Features:
    kc_upper: EMA(20) + 2*ATR(10)
    kc_lower: EMA(20) - 2*ATR(10)
    kc_width: (upper - lower) / middle
    kc_squeeze: bb_lower > kc_lower AND bb_upper < kc_upper
    kc_position: (close - lower) / (upper - lower)
```

#### 4. Squeeze Detection

```python
Squeeze Features:
    is_squeeze: BB inside Keltner
    squeeze_length: consecutive squeeze bars
    squeeze_momentum: Rate of change during squeeze
    squeeze_release: first bar after squeeze ends
    squeeze_intensity: bb_width / kc_width
```

| State | Condition | Meaning |
|-------|-----------|---------|
| **SQUEEZE ON** | BB inside KC | Compression active |
| **SQUEEZE OFF** | BB outside KC | Expansion started |
| **SQUEEZE RELEASE** | Transition from ON to OFF | Breakout moment |

### Feature Summary

```python
VBPFeatureConfig:
    atr_periods: [7, 14, 28]
    bb_period: 20
    bb_std: 2.0
    kc_period: 20
    kc_atr_mult: 1.5
    squeeze_lookback: 20
    feature_count: 45  # Total features
```

---

## Balanced Sampling

### Problem: Class Imbalance

Breakout events are rare (~5-10% of the data), which causes:

- A model biased toward the majority class
- Low recall on breakouts
- Misleading metrics

### Solution: 3x Oversample

```python
import numpy as np


class BalancedSampler:
    """Oversample breakout events (3x by default)."""

    def __init__(self, oversample_factor: float = 3.0):
        self.oversample_factor = oversample_factor

    def fit_resample(self, X: np.ndarray, y: np.ndarray):
        # y is expected to be the binary breakout label
        # (1 = breakout, 0 = no breakout); breakouts are the minority class.
        breakout_mask = y == 1
        n_breakouts = int(breakout_mask.sum())

        # Number of extra breakout samples needed to reach the target count
        target_breakouts = int(n_breakouts * self.oversample_factor)
        additional_samples = target_breakouts - n_breakouts

        # Resample breakout rows with replacement
        breakout_indices = np.where(breakout_mask)[0]
        resampled_indices = np.random.choice(
            breakout_indices, size=additional_samples, replace=True
        )

        # Append the duplicated rows to the original data
        X_resampled = np.vstack([X, X[resampled_indices]])
        y_resampled = np.concatenate([y, y[resampled_indices]])
        return X_resampled, y_resampled
```

### Alternatives Considered

| Method | Pro | Con | Selected |
|--------|-----|-----|----------|
| **SMOTE** | Generates new samples | Can introduce noise | No |
| **Class Weights** | Simple | Less effective for severe imbalance | Complement |
| **Oversample 3x** | Robust, preserves the real distribution | Overfitting risk | **YES** |

### XGBoost Configuration

```python
xgb_params = {
    'n_estimators': 300,
    'max_depth': 5,
    'learning_rate': 0.05,
    'scale_pos_weight': 3,  # Complements the oversample
    'subsample': 0.8,
    'colsample_bytree': 0.7
}
```

---

## Breakout Detection

### Breakout Definition

A **breakout** is defined as the combination of:

1. **Active squeeze** for at least N candles (`squeeze_min_length`)
2. **Price breaks** the upper/lower band
3. **Volume** above average
4. **Confirmed move** of at least X% (`breakout_threshold`)

```python
import numpy as np
import pandas as pd


def label_breakouts(
    df: pd.DataFrame,
    squeeze_min_length: int = 6,
    breakout_threshold: float = 0.015,  # 1.5%
    forward_bars: int = 12
) -> np.ndarray:
    """Label breakouts for training: 1 = up, -1 = down, 0 = none."""
    labels = np.zeros(len(df))

    for i in range(len(df) - forward_bars):
        # Only bars with a sufficiently long active squeeze are candidates
        if df['squeeze_length'].iloc[i] >= squeeze_min_length:
            # Extremes reached over the next forward_bars candles
            future_high = df['high'].iloc[i+1:i+forward_bars+1].max()
            future_low = df['low'].iloc[i+1:i+forward_bars+1].min()

            up_move = (future_high - df['close'].iloc[i]) / df['close'].iloc[i]
            down_move = (df['close'].iloc[i] - future_low) / df['close'].iloc[i]

            # If both thresholds are reached, the upward move takes precedence
            if up_move >= breakout_threshold:
                labels[i] = 1  # Breakout Up
            elif down_move >= breakout_threshold:
                labels[i] = -1  # Breakout Down

    return labels
```

### Breakout Types

| Type | Condition | Suggested Action |
|------|-----------|------------------|
| **BREAKOUT_UP** | Breaks the upper band with volume | Long |
| **BREAKOUT_DOWN** | Breaks the lower band with volume | Short |
| **FALSE_BREAKOUT** | Breaks but reverts quickly | Fade |
| **NO_BREAKOUT** | Squeeze without resolution | Wait |

---

## Training Pipeline

### Phase 1: Data Preparation

```python
import numpy as np

# 1. Compute volatility features
features = vbp_feature_engineer.compute_volatility_features(df)

# 2. Label breakouts (1 = up, -1 = down, 0 = none)
labels = label_breakouts(df, squeeze_min_length=6)

# 3. Apply balanced sampling on the binary target (any breakout vs. none)
X = np.asarray(features)
y = (labels != 0).astype(int)
X_balanced, y_balanced = balanced_sampler.fit_resample(X, y)

print(f"Original: {len(X)} samples, {(y==1).sum()} breakouts")
print(f"Balanced: {len(X_balanced)} samples, {(y_balanced==1).sum()} breakouts")
```

### Phase 2: CNN Training

```python
# Train CNN + Attention
cnn_model = VBPCNNModel(config)
cnn_model.fit(
    X_train_sequences, y_train,
    epochs=50,
    batch_size=64,
    validation_data=(X_val, y_val)
)

# Extract features from the backbone (same sequences used for training)
cnn_features = cnn_model.extract_features(X_train_sequences)
```

### Phase 3: XGBoost Training

```python
import numpy as np
from xgboost import XGBClassifier

# Combine CNN features with the original features
combined_features = np.concatenate([
    cnn_features,
    volatility_features,
    squeeze_features
], axis=1)

# Train the breakout classifier on a binary target (any breakout vs. none);
# XGBClassifier expects class labels encoded as 0..n_classes-1
breakout_mask = breakout_labels != 0
xgb_breakout = XGBClassifier(**xgb_params)
xgb_breakout.fit(combined_features, breakout_mask.astype(int))

# Train the direction classifier (breakout rows only);
# direction_labels must likewise be encoded as 0/1
xgb_direction = XGBClassifier(**xgb_params)
xgb_direction.fit(
    combined_features[breakout_mask],
    direction_labels[breakout_mask]
)
```

---

## Evaluation Metrics

### Breakout Detection Metrics

| Metric | Description | Target |
|--------|-------------|--------|
| **Breakout Recall** | Share of real breakouts detected | >= 70% |
| **Breakout Precision** | Share of predicted breakouts that occur | >= 50% |
| **F1 Score** | Precision/recall balance | >= 0.55 |
| **False Positive Rate** | False alarms | < 15% |

### Direction Metrics

| Metric | Description | Target |
|--------|-------------|--------|
| **Direction Accuracy** | Correct direction when a breakout occurs | >= 65% |
| **Timing Error** | Error in candles until breakout | < 3 candles |

### Trading Metrics

| Metric | Description | Target |
|--------|-------------|--------|
| **Profit Factor** | Gross profit / Gross loss | > 1.5 |
| **Win Rate** | Winning trades / Total trades | > 55% |
| **Avg Win/Loss Ratio** | Average win / Average loss | > 1.2 |

---

## API and Usage

### Main Class: VBPModel

```python
from models.strategies.vbp import VBPModel, VBPConfig

# Configuration
config = VBPConfig(
    conv_channels=[64, 128, 256],
    attention_heads=4,
    sequence_length=60,
    xgb_n_estimators=300,
    oversample_factor=3.0
)

# Initialize the model
model = VBPModel(config)

# Train
metrics = model.fit(df_train, df_val)

# Predict
predictions = model.predict(df_new)
for pred in predictions:
    print(f"Breakout Prob: {pred.breakout_prob:.2%}")
    print(f"Direction: {pred.breakout_direction}")
    print(f"Squeeze Intensity: {pred.squeeze_intensity:.2f}")
```

### VBPPrediction Class

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VBPPrediction:
    breakout_prob: float        # Breakout probability
    breakout_direction: str     # 'up', 'down', 'none'
    direction_confidence: float
    squeeze_intensity: float    # Intensity of the current squeeze
    squeeze_length: int         # Candles in squeeze
    expected_magnitude: float   # Expected breakout magnitude
    time_to_breakout: int       # Estimated candles until breakout

    @property
    def is_high_probability(self) -> bool:
        return self.breakout_prob > 0.7

    @property
    def signal(self) -> Optional[str]:
        # Only emit a breakout signal when a direction was actually predicted
        if (
            self.breakout_direction != 'none'
            and self.breakout_prob > 0.7
            and self.direction_confidence > 0.6
        ):
            return f"BREAKOUT_{self.breakout_direction.upper()}"
        elif self.squeeze_intensity > 0.8:
            return "SQUEEZE_ALERT"
        return None
```

### Squeeze Alerts

```python
from typing import Dict, List


class SqueezeAlert:
    """Alert system for squeezes and breakouts."""

    def check_alerts(self, prediction: VBPPrediction) -> List[Dict]:
        alerts = []

        # Intense squeeze alert
        if prediction.squeeze_intensity > 0.9:
            alerts.append({
                'type': 'SQUEEZE_EXTREME',
                'severity': 'HIGH',
                'message': f'Extreme squeeze detected ({prediction.squeeze_length} bars)'
            })

        # Imminent breakout alert
        if prediction.breakout_prob > 0.8:
            alerts.append({
                'type': 'BREAKOUT_IMMINENT',
                'severity': 'CRITICAL',
                'message': f'Breakout {prediction.breakout_direction} likely'
            })

        return alerts
```

---

## File Structure

```
apps/ml-engine/src/models/strategies/vbp/
├── __init__.py
├── model.py                 # VBPModel, VBPConfig, VBPPrediction
├── feature_engineering.py   # VBPFeatureEngineer (volatility features)
├── cnn_backbone.py          # CNN 1D + Attention architecture
├── balanced_sampler.py      # Oversample implementation
└── trainer.py               # VBPTrainer
```

---

## Production Considerations

### Real-Time Detection

```python
from typing import Dict, Optional


class VBPRealtime:
    def __init__(self, model_path: str):
        self.model = VBPModel.load(model_path)
        self.squeeze_tracker = SqueezeTracker()

    def on_candle(self, candle: Dict) -> Optional[VBPPrediction]:
        # Update the tracker
        self.squeeze_tracker.update(candle)

        # Only predict while a squeeze is active
        if self.squeeze_tracker.is_squeeze:
            features = self.compute_features()
            return self.model.predict_single(features)

        return None
```

### Alert Configuration

```yaml
# config/vbp_alerts.yml
alerts:
  squeeze_alert:
    min_intensity: 0.8
    min_length: 6
    channels: ['telegram', 'webhook']
  breakout_alert:
    min_probability: 0.75
    min_direction_confidence: 0.6
    channels: ['telegram', 'webhook', 'email']
```

### Performance Benchmarks

| Operation | Time | Batch Size |
|-----------|------|------------|
| Feature computation | 5ms | 1 |
| CNN inference | 10ms | 1 |
| Full prediction | 20ms | 1 |
| Bulk inference | 500ms | 1000 |

---

## References

- [ET-ML-001: ML Engine Architecture](./ET-ML-001-arquitectura.md)
- [ET-ML-010: PVA Strategy](./ET-ML-010-pva-strategy.md)
- [Bollinger Bands](https://school.stockcharts.com/doku.php?id=technical_indicators:bollinger_bands)
- [TTM Squeeze Indicator](https://www.tradingview.com/scripts/ttmsqueeze/)

---

**Author:** ML-Specialist (NEXUS v4.0)
**Date:** 2026-01-25
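
The squeeze rule from the Squeeze Detection section (Bollinger Bands inside the Keltner Channel) can be illustrated with a minimal pandas sketch. This is not the project's `VBPFeatureEngineer`; the function name `squeeze_features` is hypothetical, a single `period` is assumed for both indicators, ATR is approximated with a simple moving average of the true range, and the default `kc_atr_mult=1.5` is taken from `VBPFeatureConfig`. The input is assumed to have lowercase `high`, `low`, and `close` columns, as in `label_breakouts`.

```python
import numpy as np
import pandas as pd


def squeeze_features(df: pd.DataFrame, period: int = 20, bb_std: float = 2.0,
                     kc_atr_mult: float = 1.5) -> pd.DataFrame:
    """Hypothetical helper: flag bars where the Bollinger Bands sit inside the Keltner Channel."""
    close, high, low = df["close"], df["high"], df["low"]

    # Bollinger Bands: SMA +/- bb_std * rolling standard deviation.
    mid = close.rolling(period).mean()
    std = close.rolling(period).std(ddof=0)
    bb_upper, bb_lower = mid + bb_std * std, mid - bb_std * std

    # True range, then an SMA-based ATR (assumption: Wilder smoothing is not used here).
    prev_close = close.shift(1)
    tr = pd.concat(
        [high - low, (high - prev_close).abs(), (low - prev_close).abs()], axis=1
    ).max(axis=1)
    atr = tr.rolling(period).mean()

    # Keltner Channel: EMA +/- kc_atr_mult * ATR.
    ema = close.ewm(span=period, adjust=False).mean()
    kc_upper, kc_lower = ema + kc_atr_mult * atr, ema - kc_atr_mult * atr

    # Squeeze is ON when both Bollinger Bands lie strictly inside the Keltner Channel.
    # Warm-up bars contain NaN, so the comparison is False there.
    is_squeeze = (bb_lower > kc_lower) & (bb_upper < kc_upper)

    # Consecutive squeeze bars: every non-squeeze bar starts a new run, and the
    # cumulative sum inside each run counts the squeeze bars seen so far.
    run_id = (~is_squeeze).cumsum()
    squeeze_length = is_squeeze.astype(int).groupby(run_id).cumsum()

    return pd.DataFrame({
        "is_squeeze": is_squeeze,
        "squeeze_length": squeeze_length,
        "squeeze_intensity": (bb_upper - bb_lower) / (kc_upper - kc_lower),
    })
```

The `squeeze_length` column produced here is the kind of input `label_breakouts` reads when it checks `squeeze_min_length`. A production version would likely use the separate `bb_period`, `kc_period`, and ATR settings from the feature configuration rather than one shared `period`.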