---
id: "ET-ML-012"
title: "VBP (Volatility Breakout Predictor) Strategy"
type: "Technical Specification"
status: "Approved"
priority: "Alta"
epic: "OQI-006"
project: "trading-platform"
version: "1.0.0"
created_date: "2026-01-25"
updated_date: "2026-01-25"
task_reference: "TASK-2026-01-25-ML-TRAINING-ENHANCEMENT"
---

# ET-ML-012: VBP (Volatility Breakout Predictor) Strategy

## Metadata

| Field | Value |
|-------|-------|
| **ID** | ET-ML-012 |
| **Epic** | OQI-006 - ML Signals |
| **Type** | Technical Specification |
| **Version** | 1.0.0 |
| **Status** | Approved |
| **Last updated** | 2026-01-25 |
| **Task Reference** | TASK-2026-01-25-ML-TRAINING-ENHANCEMENT |

---

## Summary

The VBP (Volatility Breakout Predictor) strategy predicts imminent volatility breakouts using a **1D CNN** with **Attention** whose output feeds **XGBoost** classifiers. The model specializes in detecting the volatility compressions that precede explosive moves.

### Key Features

- **CNN 1D + Attention**: Extracts local patterns and global relationships
- **Volatility Features**: ATR, Bollinger Bands, Keltner Channels
- **Squeeze Detection**: Identifies volatility compressions
- **Balanced Sampling**: 3x oversample of breakout events

---

## Architecture

### High-Level Diagram

```
┌─────────────────────────────────────────────────────────────────────────┐
│                           VBP MODEL                                      │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  Input: OHLCV Sequence (seq_len x n_features)                           │
│         └── Volatility features (ATR, BB, Keltner)                      │
│                          │                                               │
│                          ▼                                               │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                     CNN 1D BACKBONE                                │  │
│  │                                                                     │  │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐               │  │
│  │  │  Conv1D     │  │  Conv1D     │  │  Conv1D     │               │  │
│  │  │  64 filters │──▶│ 128 filters │──▶│ 256 filters │              │  │
│  │  │  kernel=7   │  │  kernel=5   │  │  kernel=3   │               │  │
│  │  └─────────────┘  └─────────────┘  └─────────────┘               │  │
│  │         │                │                │                        │  │
│  │         └────────────────┼────────────────┘                        │  │
│  │                          ▼                                          │  │
│  │               ┌─────────────────────┐                              │  │
│  │               │  Multi-Scale Concat │                              │  │
│  │               └─────────────────────┘                              │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                          │                                               │
│                          ▼                                               │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                   ATTENTION LAYER                                   │  │
│  │                                                                     │  │
│  │  ┌─────────────────┐     ┌─────────────────────────────────────┐  │  │
│  │  │ Self-Attention  │────▶│ Temporal Attention Pooling          │  │  │
│  │  │ (4 heads)       │     │ (weighted avg over sequence)        │  │  │
│  │  └─────────────────┘     └─────────────────────────────────────┘  │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                          │                                               │
│                          ▼                                               │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                    XGBOOST HEAD                                    │  │
│  │                                                                     │  │
│  │  ┌──────────────────┐  ┌──────────────────┐                       │  │
│  │  │ Breakout         │  │ Breakout         │                       │  │
│  │  │ Classifier       │  │ Direction        │                       │  │
│  │  │ (binary)         │  │ Classifier       │                       │  │
│  │  └──────────────────┘  └──────────────────┘                       │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                          │                                               │
│                          ▼                                               │
│  Output: VBPPrediction                                                   │
│          - breakout_prob: float (0 to 1)                                │
│          - breakout_direction: 'up' | 'down' | 'none'                   │
│          - squeeze_intensity: float                                      │
│          - expected_magnitude: float                                     │
└─────────────────────────────────────────────────────────────────────────┘
```

### CNN 1D Configuration

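```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CNNConfig:
    conv_channels: List[int] = field(default_factory=lambda: [64, 128, 256])
    kernel_sizes: List[int] = field(default_factory=lambda: [7, 5, 3])
    pool_sizes: List[int] = field(default_factory=lambda: [2, 2, 2])
    activation: str = 'gelu'
    batch_norm: bool = True
    dropout: float = 0.2
```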

| Layer | Filters | Kernel | Output |
|-------|---------|--------|--------|
| Conv1D_1 | 64 | 7 | Local patterns |
| Conv1D_2 | 128 | 5 | Mid-range patterns |
| Conv1D_3 | 256 | 3 | High-level abstractions |
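
To make the effect of `pool_sizes` concrete, the sketch below traces the temporal length of a sequence through the three stages. It assumes 'same'-padded convolutions and a non-overlapping pool after each one; the spec states neither the padding mode nor where pooling is applied, so both are assumptions, and `trace_lengths` is a hypothetical helper.

```python
def trace_lengths(seq_len: int, pool_sizes: list[int]) -> list[int]:
    """Return the temporal length after each conv + pool stage.

    Assumes 'same'-padded convolutions (length unchanged) followed by
    non-overlapping pooling, which floor-divides the length.
    """
    lengths = []
    current = seq_len
    for pool in pool_sizes:
        current = current // pool
        lengths.append(current)
    return lengths


print(trace_lengths(60, [2, 2, 2]))  # [30, 15, 7]
```

Under these assumptions the three stages produce different lengths, so the Multi-Scale Concat step in the diagram would need some alignment (for example, pooling each stage to a common length). The spec does not say which approach is used.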

### Attention Configuration

| Parameter | Value |
|-----------|-------|
| **n_heads** | 4 |
| **attention_dim** | 128 |
| **dropout** | 0.1 |
| **pooling** | Temporal weighted average |
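
The "temporal weighted average" pooling can be illustrated with a small NumPy sketch: each time step gets a score, the scores go through a softmax, and the hidden states are averaged with those weights. The scoring vector `w` stands in for learned parameters and the function name is hypothetical; the actual layer in `cnn_backbone.py` may differ.

```python
import numpy as np


def temporal_attention_pool(h: np.ndarray, w: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Pool a (seq_len, dim) sequence into a single (dim,) vector.

    h: hidden states, one row per time step.
    w: scoring vector of shape (dim,), a stand-in for learned parameters.
    """
    scores = h @ w                          # one scalar score per time step
    scores = scores - scores.max()          # stabilise the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    pooled = weights @ h                    # weighted average over time
    return pooled, weights


rng = np.random.default_rng(0)
h = rng.normal(size=(7, 128))   # 7 time steps, attention_dim = 128
w = rng.normal(size=128)
pooled, weights = temporal_attention_pool(h, w)
print(pooled.shape)
```

The weights are non-negative and sum to one, so the pooled vector is a convex combination of the per-step hidden states.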

---

## Feature Engineering

### Volatility Features

#### 1. ATR (Average True Range)

```python
ATR Features:
    atr_14: ATR(14)
    atr_7: ATR(7)
    atr_28: ATR(28)
    atr_ratio: atr_7 / atr_28
    atr_percentile: rolling percentile of ATR
    normalized_atr: atr / close
```

| Feature | Description | Use |
|---------|-------------|-----|
| `atr_14` | Standard ATR | Baseline volatility |
| `atr_ratio` | Short/long ATR | Expansion/contraction |
| `normalized_atr` | ATR as a % of price | Comparison across symbols |
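
A minimal pandas sketch of the ATR features listed above. It assumes ATR is a simple rolling mean of the true range (Wilder smoothing is the other common choice; the spec does not say which is used), and it leaves out `atr_percentile`. The function names are illustrative.

```python
import pandas as pd


def atr(df: pd.DataFrame, period: int) -> pd.Series:
    """ATR as a simple rolling mean of the true range (assumption)."""
    prev_close = df["close"].shift(1)
    true_range = pd.concat(
        [
            df["high"] - df["low"],
            (df["high"] - prev_close).abs(),
            (df["low"] - prev_close).abs(),
        ],
        axis=1,
    ).max(axis=1)
    return true_range.rolling(period).mean()


def atr_features(df: pd.DataFrame) -> pd.DataFrame:
    out = pd.DataFrame(index=df.index)
    for p in (7, 14, 28):
        out[f"atr_{p}"] = atr(df, p)
    out["atr_ratio"] = out["atr_7"] / out["atr_28"]        # < 1 means contraction
    out["normalized_atr"] = out["atr_14"] / df["close"]    # comparable across symbols
    return out


# Flat market with a constant 2-point range: ATR is 2 at every period.
df = pd.DataFrame({"high": [101.0] * 40, "low": [99.0] * 40, "close": [100.0] * 40})
feats = atr_features(df)
print(feats.iloc[-1].to_dict())
```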

#### 2. Bollinger Bands

```python
BB Features:
    bb_upper: SMA(20) + 2*std
    bb_lower: SMA(20) - 2*std
    bb_width: (upper - lower) / middle
    bb_width_percentile: percentile(bb_width, 100)
    bb_squeeze: bb_width < percentile_20
    bb_expansion: bb_width > percentile_80
    bb_position: (close - lower) / (upper - lower)
```

| Feature | Formula | Interpretation |
|---------|---------|----------------|
| `bb_width` | `(upper - lower) / middle` | Relative width |
| `bb_squeeze` | `bb_width < percentile(20)` | Volatility compression |
| `bb_position` | Position within the bands (0 = lower, 1 = upper) | Overbought/oversold |
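
The Bollinger features can be sketched in pandas as follows. The sketch assumes the population standard deviation (`ddof=0`) and reads `percentile_20` / `percentile_80` as rolling quantiles of `bb_width` over a 100-bar lookback; both readings are assumptions, and `bollinger_features` is a hypothetical name.

```python
import pandas as pd


def bollinger_features(close: pd.Series, period: int = 20, n_std: float = 2.0,
                       lookback: int = 100) -> pd.DataFrame:
    middle = close.rolling(period).mean()
    std = close.rolling(period).std(ddof=0)
    upper = middle + n_std * std
    lower = middle - n_std * std
    width = (upper - lower) / middle
    out = pd.DataFrame({"bb_upper": upper, "bb_lower": lower, "bb_width": width})
    # Squeeze/expansion compare the current width with rolling quantiles of itself.
    out["bb_squeeze"] = width < width.rolling(lookback).quantile(0.20)
    out["bb_expansion"] = width > width.rolling(lookback).quantile(0.80)
    # 0 = at the lower band, 1 = at the upper band.
    out["bb_position"] = (close - lower) / (upper - lower)
    return out


# Price alternating between 99 and 101: mean 100, std 1, bands at 98 and 102.
close = pd.Series([99.0, 101.0] * 20)
bb = bollinger_features(close, lookback=10)
print(bb[["bb_upper", "bb_lower", "bb_width", "bb_position"]].iloc[-1].round(4).to_dict())
```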

#### 3. Keltner Channels

```python
Keltner Features:
    kc_upper: EMA(20) + kc_atr_mult*ATR(10)   # kc_atr_mult = 1.5 (VBPFeatureConfig)
    kc_lower: EMA(20) - kc_atr_mult*ATR(10)
    kc_width: (upper - lower) / middle
    kc_squeeze: bb_lower > kc_lower AND bb_upper < kc_upper
    kc_position: (close - lower) / (upper - lower)
```
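
A possible pandas version of the Keltner features, given precomputed Bollinger bands. The ATR here is a simple rolling mean of the true range (an assumption), the multiplier defaults to the `kc_atr_mult` value from `VBPFeatureConfig`, and `keltner_features` is a hypothetical name.

```python
import pandas as pd


def keltner_features(df: pd.DataFrame, bb_upper: pd.Series, bb_lower: pd.Series,
                     period: int = 20, atr_period: int = 10,
                     atr_mult: float = 1.5) -> pd.DataFrame:
    prev_close = df["close"].shift(1)
    true_range = pd.concat(
        [df["high"] - df["low"],
         (df["high"] - prev_close).abs(),
         (df["low"] - prev_close).abs()],
        axis=1,
    ).max(axis=1)
    atr = true_range.rolling(atr_period).mean()
    middle = df["close"].ewm(span=period, adjust=False).mean()
    upper = middle + atr_mult * atr
    lower = middle - atr_mult * atr
    out = pd.DataFrame({"kc_upper": upper, "kc_lower": lower})
    out["kc_width"] = (upper - lower) / middle
    # Squeeze: both Bollinger bands sit inside the Keltner channel.
    out["kc_squeeze"] = (bb_lower > lower) & (bb_upper < upper)
    out["kc_position"] = (df["close"] - lower) / (upper - lower)
    return out


n = 30
df = pd.DataFrame({"high": [101.0] * n, "low": [99.0] * n, "close": [100.0] * n})
kc = keltner_features(df, bb_upper=pd.Series([102.0] * n), bb_lower=pd.Series([98.0] * n))
print(kc.iloc[-1].to_dict())
```

Rows where the ATR is still warming up contain NaN bands, so `kc_squeeze` is False there rather than undefined.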

#### 4. Squeeze Detection

```python
Squeeze Features:
    is_squeeze: BB inside Keltner
    squeeze_length: consecutive squeeze bars
    squeeze_momentum: Rate of change during squeeze
    squeeze_release: first bar after squeeze ends
    squeeze_intensity: bb_width / kc_width
```

| State | Condition | Meaning |
|-------|-----------|---------|
| **SQUEEZE ON** | BB inside KC | Compression active |
| **SQUEEZE OFF** | BB outside KC | Expansion started |
| **SQUEEZE RELEASE** | Transition from ON to OFF | Breakout moment |
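
Given a boolean `is_squeeze` series, `squeeze_length` and `squeeze_release` can be derived without an explicit loop. This is a sketch of one way to do it; `squeeze_state` is a hypothetical helper, not necessarily how `VBPFeatureEngineer` computes these columns.

```python
import pandas as pd


def squeeze_state(is_squeeze: pd.Series) -> pd.DataFrame:
    flags = is_squeeze.astype(int)
    # Every change of state starts a new run; cumsum turns that into a run id.
    run_id = flags.diff().ne(0).cumsum()
    # Counting within each run gives consecutive squeeze bars (0 outside a squeeze).
    length = flags.groupby(run_id).cumsum()
    # Release: first non-squeeze bar immediately after a squeeze bar.
    release = (flags == 0) & (flags.shift(1, fill_value=0) == 1)
    return pd.DataFrame({"squeeze_length": length, "squeeze_release": release})


is_squeeze = pd.Series([False, True, True, True, False, False, True, True, False])
state = squeeze_state(is_squeeze)
print(state["squeeze_length"].tolist())   # [0, 1, 2, 3, 0, 0, 1, 2, 0]
print(state["squeeze_release"].tolist())
```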

### Feature Summary

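```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VBPFeatureConfig:
    atr_periods: List[int] = field(default_factory=lambda: [7, 14, 28])
    bb_period: int = 20
    bb_std: float = 2.0
    kc_period: int = 20
    kc_atr_mult: float = 1.5
    squeeze_lookback: int = 20
    feature_count: int = 45  # Total features
```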

---

## Balanced Sampling

### Problem: Class Imbalance

Breakout events are rare (~5-10% of the data), which causes:
- A model biased toward the majority class
- Low recall on breakouts
- Misleading metrics
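
A short arithmetic sketch shows why accuracy is misleading at this level of imbalance. It evaluates a degenerate classifier that never predicts a breakout, using a 7% breakout rate as an illustrative value inside the 5-10% range quoted above.

```python
def evaluate_always_negative(n_samples: int, breakout_rate: float) -> dict[str, float]:
    """Accuracy and recall of a classifier that always predicts 'no breakout'."""
    n_pos = round(n_samples * breakout_rate)
    n_neg = n_samples - n_pos
    # No breakout is ever predicted, so every negative is correct and TP = 0.
    accuracy = n_neg / n_samples
    recall = 0 / n_pos if n_pos else 0.0
    return {"accuracy": accuracy, "recall": recall}


print(evaluate_always_negative(10_000, 0.07))  # {'accuracy': 0.93, 'recall': 0.0}
```

The model scores 93% accuracy while detecting no breakouts at all, which is why the evaluation targets in this spec are stated in terms of recall, precision, and F1.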

### Solution: 3x Oversample

```python
import numpy as np


class BalancedSampler:
    """Oversample breakout events (3x by default)."""

    def __init__(self, oversample_factor: float = 3.0):
        self.oversample_factor = oversample_factor

    def fit_resample(self, X: np.ndarray, y: np.ndarray):
        # Identify breakouts (minority class); labels may be 1 (up) or -1 (down)
        breakout_mask = y != 0
        n_breakouts = int(breakout_mask.sum())

        # Number of extra breakout samples needed to reach the target count
        target_breakouts = int(n_breakouts * self.oversample_factor)
        additional_samples = target_breakouts - n_breakouts

        # Resample breakout rows with replacement
        breakout_indices = np.where(breakout_mask)[0]
        resampled_indices = np.random.choice(
            breakout_indices,
            size=additional_samples,
            replace=True
        )

        # Append the duplicated rows to the original data
        X_resampled = np.vstack([X, X[resampled_indices]])
        y_resampled = np.concatenate([y, y[resampled_indices]])

        return X_resampled, y_resampled
```

### Alternatives Considered

| Method | Pros | Cons | Selected |
|--------|------|------|----------|
| **SMOTE** | Generates new samples | Can introduce noise | No |
| **Class Weights** | Simple | Less effective under severe imbalance | As a complement |
| **Oversample 3x** | Robust; keeps the real data distribution (no synthetic samples) | Overfitting risk | **YES** |

### XGBoost Configuration

```python
xgb_params = {
    'n_estimators': 300,
    'max_depth': 5,
    'learning_rate': 0.05,
    'scale_pos_weight': 3,  # Complements the oversampling
    'subsample': 0.8,
    'colsample_bytree': 0.7
}
```
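
Because `scale_pos_weight` multiplies the weight of every positive sample, it stacks with the 3x oversample. The sketch below works through the effective negative-to-positive ratio for an illustrative dataset with a 7% breakout rate; the counts are made up for the example and the helper name is hypothetical.

```python
def effective_neg_pos_ratio(n_neg: int, n_pos: int,
                            oversample_factor: float = 3.0,
                            scale_pos_weight: float = 3.0) -> dict[str, float]:
    raw = n_neg / n_pos
    # Oversampling multiplies the number of positive rows.
    after_oversample = n_neg / (n_pos * oversample_factor)
    # scale_pos_weight multiplies the weight of each positive row.
    after_weighting = after_oversample / scale_pos_weight
    return {"raw": raw, "after_oversample": after_oversample,
            "after_weighting": after_weighting}


ratios = effective_neg_pos_ratio(n_neg=9_300, n_pos=700)
print({k: round(v, 2) for k, v in ratios.items()})
# {'raw': 13.29, 'after_oversample': 4.43, 'after_weighting': 1.48}
```

With these numbers the combined effect is roughly a 9x up-weighting of breakouts, which brings the classes close to parity. On data with a higher breakout rate the same settings could over-correct, so both values are worth revisiting per dataset.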

---

## Breakout Detection

### Breakout Definition

A **breakout** is defined by the following conditions:

1. **Active squeeze** for at least N candles
2. **Price breaks** the upper/lower band
3. **Volume** above average
4. **Confirmed move** of X%

```python
import numpy as np
import pandas as pd


def label_breakouts(
    df: pd.DataFrame,
    squeeze_min_length: int = 6,
    breakout_threshold: float = 0.015,  # 1.5%
    forward_bars: int = 12
) -> np.ndarray:
    """Label breakouts for training (1 = up, -1 = down, 0 = none).

    Only the squeeze-length and confirmed-move conditions are checked here;
    the band-break and volume conditions are not part of this function.
    """
    labels = np.zeros(len(df))

    for i in range(len(df) - forward_bars):
        # Require an active squeeze of sufficient length
        if df['squeeze_length'].iloc[i] >= squeeze_min_length:
            # Look for a breakout in the next forward_bars candles
            future_high = df['high'].iloc[i+1:i+forward_bars+1].max()
            future_low = df['low'].iloc[i+1:i+forward_bars+1].min()

            up_move = (future_high - df['close'].iloc[i]) / df['close'].iloc[i]
            down_move = (df['close'].iloc[i] - future_low) / df['close'].iloc[i]

            # If both thresholds are hit within the window, the up label wins
            if up_move >= breakout_threshold:
                labels[i] = 1  # Breakout Up
            elif down_move >= breakout_threshold:
                labels[i] = -1  # Breakout Down

    return labels
```
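
The row-by-row loop in `label_breakouts` can be slow on long histories. A vectorized variant that is meant to produce the same labels is sketched below, using a rolling window over the reversed series to look forward in time. `forward_extremes` and `label_breakouts_vectorized` are hypothetical names, and this is an alternative sketch rather than the implementation in the codebase.

```python
import numpy as np
import pandas as pd


def forward_extremes(df: pd.DataFrame, forward_bars: int) -> pd.DataFrame:
    """Max high / min low over bars i+1 .. i+forward_bars for every bar i."""
    # Rolling over the reversed series looks forward in time; shift(-1)
    # then excludes the current bar from its own window.
    fwd_high = df["high"][::-1].rolling(forward_bars).max()[::-1].shift(-1)
    fwd_low = df["low"][::-1].rolling(forward_bars).min()[::-1].shift(-1)
    return pd.DataFrame({"future_high": fwd_high, "future_low": fwd_low})


def label_breakouts_vectorized(df: pd.DataFrame, squeeze_min_length: int = 6,
                               breakout_threshold: float = 0.015,
                               forward_bars: int = 12) -> np.ndarray:
    fwd = forward_extremes(df, forward_bars)
    up_move = (fwd["future_high"] - df["close"]) / df["close"]
    down_move = (df["close"] - fwd["future_low"]) / df["close"]
    in_squeeze = df["squeeze_length"] >= squeeze_min_length
    # np.select takes the first matching condition, so "up" wins ties,
    # and rows without a full forward window (NaN) fall through to 0.
    return np.select(
        [in_squeeze & (up_move >= breakout_threshold),
         in_squeeze & (down_move >= breakout_threshold)],
        [1, -1],
        default=0,
    )


df = pd.DataFrame({
    "close": [100.0] * 5,
    "high": [100.0, 100.0, 102.0, 100.0, 100.0],
    "low": [100.0, 100.0, 100.0, 98.0, 100.0],
    "squeeze_length": [1, 1, 1, 0, 1],
})
labels = label_breakouts_vectorized(df, squeeze_min_length=1,
                                    breakout_threshold=0.01, forward_bars=2)
print(labels.tolist())  # [1, 1, -1, 0, 0]
```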

### Breakout Types

| Type | Condition | Suggested Action |
|------|-----------|------------------|
| **BREAKOUT_UP** | Breaks the upper band with volume | Long |
| **BREAKOUT_DOWN** | Breaks the lower band with volume | Short |
| **FALSE_BREAKOUT** | Breaks but reverses quickly | Fade |
| **NO_BREAKOUT** | Squeeze without resolution | Wait |

---

## Training Pipeline

### Phase 1: Data Preparation

```python
# 1. Compute volatility features
features = vbp_feature_engineer.compute_volatility_features(df)

# 2. Label breakouts (1 = up, -1 = down, 0 = none)
labels = label_breakouts(df, squeeze_min_length=6)

# 3. Apply balanced sampling
X, y = np.asarray(features), labels
X_balanced, y_balanced = balanced_sampler.fit_resample(X, y)

print(f"Original: {len(X)} samples, {(y != 0).sum()} breakouts")
print(f"Balanced: {len(X_balanced)} samples, {(y_balanced != 0).sum()} breakouts")
```

### Phase 2: CNN Training

```python
# Train CNN + Attention
cnn_model = VBPCNNModel(config)
cnn_model.fit(
    X_train_sequences,
    y_train,
    epochs=50,
    batch_size=64,
    validation_data=(X_val, y_val)
)

# Extract features from the backbone for the same training sequences
cnn_features = cnn_model.extract_features(X_train_sequences)
```
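
Phase 2 consumes `X_train_sequences`, but the spec does not show how the `(seq_len x n_features)` windows are built from the per-bar feature matrix. One straightforward way is a sliding window, sketched below; `make_sequences` is a hypothetical helper, and labelling each window with its last bar is an assumption.

```python
import numpy as np


def make_sequences(X: np.ndarray, y: np.ndarray, seq_len: int) -> tuple[np.ndarray, np.ndarray]:
    """Turn a (T, n_features) matrix into (N, seq_len, n_features) windows.

    Each window ends at bar t and takes the label of bar t, so the model
    only sees information available up to the bar being labelled.
    """
    n_windows = len(X) - seq_len + 1
    windows = np.stack([X[i:i + seq_len] for i in range(n_windows)])
    return windows, y[seq_len - 1:]


X = np.arange(10 * 3, dtype=float).reshape(10, 3)   # 10 bars, 3 features
y = np.arange(10)
X_seq, y_seq = make_sequences(X, y, seq_len=4)
print(X_seq.shape, y_seq.shape)  # (7, 4, 3) (7,)
```

With the `sequence_length=60` used in `VBPConfig`, the first 59 bars of each series produce no training sample.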

### Phase 3: XGBoost Training

```python
# Combine CNN features with the original features
combined_features = np.concatenate([
    cnn_features,
    volatility_features,
    squeeze_features
], axis=1)

# Train the breakout classifier on a binary target (breakout vs. no breakout);
# XGBClassifier expects class labels 0..n-1, so the raw -1/0/1 labels cannot be used
is_breakout = (breakout_labels != 0).astype(int)
xgb_breakout = XGBClassifier(**xgb_params)
xgb_breakout.fit(combined_features, is_breakout)

# Train the direction classifier (breakout rows only);
# direction_labels must likewise be encoded as 0/1
breakout_mask = breakout_labels != 0
xgb_direction = XGBClassifier(**xgb_params)
xgb_direction.fit(
    combined_features[breakout_mask],
    direction_labels[breakout_mask]
)
```
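
At inference time the two classifier outputs have to be merged into the `breakout_direction` field of `VBPPrediction`. The spec does not describe that step, so the following is only a sketch of one plausible rule; `combine_heads` and its `breakout_threshold` cut-off are hypothetical.

```python
def combine_heads(breakout_prob: float, p_up: float,
                  breakout_threshold: float = 0.5) -> tuple[str, float]:
    """Map the two classifier outputs to (breakout_direction, direction_confidence).

    breakout_prob: P(breakout) from the binary classifier.
    p_up: P(up | breakout) from the direction classifier.
    breakout_threshold: hypothetical cut-off, not taken from the spec.
    """
    if breakout_prob < breakout_threshold:
        return "none", 0.0
    if p_up >= 0.5:
        return "up", p_up
    return "down", 1.0 - p_up


print(combine_heads(0.82, 0.71))  # ('up', 0.71)
print(combine_heads(0.82, 0.20))  # ('down', 0.8)
print(combine_heads(0.30, 0.90))  # ('none', 0.0)
```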

---

## Evaluation Metrics

### Breakout Detection Metrics

| Metric | Description | Target |
|--------|-------------|--------|
| **Breakout Recall** | Share of real breakouts detected | >= 70% |
| **Breakout Precision** | Share of predicted breakouts that occur | >= 50% |
| **F1 Score** | Precision/recall balance | >= 0.55 |
| **False Positive Rate** | False alarms | < 15% |
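
The detection metrics follow directly from the confusion-matrix counts. The sketch below computes them and checks them against the targets in the table; the counts in the example are illustrative, not measured results, and the function names are hypothetical.

```python
def detection_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1,
            "false_positive_rate": fpr}


def meets_targets(m: dict[str, float]) -> bool:
    """Targets taken from the table above."""
    return (m["recall"] >= 0.70 and m["precision"] >= 0.50
            and m["f1"] >= 0.55 and m["false_positive_rate"] < 0.15)


m = detection_metrics(tp=70, fp=60, fn=30, tn=840)
print({k: round(v, 3) for k, v in m.items()})
# {'precision': 0.538, 'recall': 0.7, 'f1': 0.609, 'false_positive_rate': 0.067}
print(meets_targets(m))  # True
```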

### Direction Metrics

| Metric | Description | Target |
|--------|-------------|--------|
| **Direction Accuracy** | Correct direction when a breakout occurs | >= 65% |
| **Timing Error** | Error in candles until the breakout | < 3 candles |

### Trading Metrics

| Metric | Description | Target |
|--------|-------------|--------|
| **Profit Factor** | Gross profit / Gross loss | > 1.5 |
| **Win Rate** | Winning trades / Total trades | > 55% |
| **Avg Win/Loss Ratio** | Average win / Average loss | > 1.2 |
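
The three trading metrics can be computed from a list of per-trade PnL values. The trade list below is made up for illustration, and `trading_metrics` is a hypothetical helper.

```python
def trading_metrics(pnl: list[float]) -> dict[str, float]:
    wins = [p for p in pnl if p > 0]
    losses = [-p for p in pnl if p < 0]       # stored as positive amounts
    gross_profit, gross_loss = sum(wins), sum(losses)
    if wins and losses:
        avg_ratio = (gross_profit / len(wins)) / (gross_loss / len(losses))
    else:
        avg_ratio = float("nan")              # undefined without both sides
    return {
        "profit_factor": gross_profit / gross_loss if gross_loss else float("inf"),
        "win_rate": len(wins) / len(pnl) if pnl else 0.0,
        "avg_win_loss_ratio": avg_ratio,
    }


trades = [120.0, -80.0, 150.0, -60.0, 90.0, -100.0, 200.0, 40.0]
print(trading_metrics(trades))
# {'profit_factor': 2.5, 'win_rate': 0.625, 'avg_win_loss_ratio': 1.5}
```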

---

## API and Usage

### Main Class: VBPModel

```python
from models.strategies.vbp import VBPModel, VBPConfig

# Configuration
config = VBPConfig(
    conv_channels=[64, 128, 256],
    attention_heads=4,
    sequence_length=60,
    xgb_n_estimators=300,
    oversample_factor=3.0
)

# Initialize the model
model = VBPModel(config)

# Train
metrics = model.fit(df_train, df_val)

# Predict
predictions = model.predict(df_new)
for pred in predictions:
    print(f"Breakout Prob: {pred.breakout_prob:.2%}")
    print(f"Direction: {pred.breakout_direction}")
    print(f"Squeeze Intensity: {pred.squeeze_intensity:.2f}")
```

### VBPPrediction Class

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VBPPrediction:
    breakout_prob: float        # Probability of a breakout
    breakout_direction: str     # 'up', 'down', 'none'
    direction_confidence: float
    squeeze_intensity: float    # Intensity of the current squeeze
    squeeze_length: int         # Candles spent in the squeeze
    expected_magnitude: float   # Expected magnitude of the breakout
    time_to_breakout: int       # Estimated candles until the breakout

    @property
    def is_high_probability(self) -> bool:
        return self.breakout_prob > 0.7

    @property
    def signal(self) -> Optional[str]:
        # Emit a breakout signal only when a direction is available,
        # so that 'none' never produces "BREAKOUT_NONE"
        has_direction = self.breakout_direction in ('up', 'down')
        if has_direction and self.breakout_prob > 0.7 and self.direction_confidence > 0.6:
            return f"BREAKOUT_{self.breakout_direction.upper()}"
        elif self.squeeze_intensity > 0.8:
            return "SQUEEZE_ALERT"
        return None
```

### Squeeze Alerts

```python
from typing import Dict, List


class SqueezeAlert:
    """Alert system for squeezes and breakouts"""

    def check_alerts(self, prediction: VBPPrediction) -> List[Dict]:
        alerts = []

        # Intense squeeze alert
        if prediction.squeeze_intensity > 0.9:
            alerts.append({
                'type': 'SQUEEZE_EXTREME',
                'severity': 'HIGH',
                'message': f'Extreme squeeze detected ({prediction.squeeze_length} bars)'
            })

        # Imminent breakout alert
        if prediction.breakout_prob > 0.8:
            alerts.append({
                'type': 'BREAKOUT_IMMINENT',
                'severity': 'CRITICAL',
                'message': f'Breakout {prediction.breakout_direction} likely'
            })

        return alerts
```

---

## File Structure

```
apps/ml-engine/src/models/strategies/vbp/
├── __init__.py
├── model.py              # VBPModel, VBPConfig, VBPPrediction
├── feature_engineering.py # VBPFeatureEngineer (volatility features)
├── cnn_backbone.py       # CNN 1D + Attention architecture
├── balanced_sampler.py   # Oversample implementation
└── trainer.py            # VBPTrainer
```

---

## Production Considerations

### Real-Time Detection

```python
from typing import Dict, Optional


class VBPRealtime:
    def __init__(self, model_path: str):
        self.model = VBPModel.load(model_path)
        self.squeeze_tracker = SqueezeTracker()

    def on_candle(self, candle: Dict) -> Optional[VBPPrediction]:
        # Update the tracker with the latest candle
        self.squeeze_tracker.update(candle)

        # Predict only while a squeeze is active
        if self.squeeze_tracker.is_squeeze:
            features = self.compute_features()
            return self.model.predict_single(features)

        return None
```
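
`SqueezeTracker` is referenced above but not defined in this spec. The following is a minimal sketch of what such a tracker might look like, exposing the `update` method and `is_squeeze` attribute that `VBPRealtime` relies on. It assumes each candle dict already carries the band values under the hypothetical keys `bb_upper`, `bb_lower`, `kc_upper`, and `kc_lower`.

```python
class SqueezeTracker:
    """Track squeeze state bar by bar (hypothetical sketch)."""

    def __init__(self) -> None:
        self.is_squeeze = False
        self.squeeze_length = 0
        self.just_released = False

    def update(self, candle: dict) -> None:
        # Squeeze: Bollinger bands inside the Keltner channel.
        inside = (candle["bb_lower"] > candle["kc_lower"]
                  and candle["bb_upper"] < candle["kc_upper"])
        # A release is the first bar outside the channel after a squeeze.
        self.just_released = self.is_squeeze and not inside
        self.squeeze_length = self.squeeze_length + 1 if inside else 0
        self.is_squeeze = inside


tracker = SqueezeTracker()
tight = {"bb_upper": 102.0, "bb_lower": 98.0, "kc_upper": 103.0, "kc_lower": 97.0}
wide = {"bb_upper": 105.0, "bb_lower": 95.0, "kc_upper": 103.0, "kc_lower": 97.0}

history = []
for candle in (tight, tight, wide):
    tracker.update(candle)
    history.append((tracker.is_squeeze, tracker.squeeze_length, tracker.just_released))
print(history)
# [(True, 1, False), (True, 2, False), (False, 0, True)]
```

Note that `VBPRealtime.on_candle` returns `None` on the release bar itself, since it predicts only while the squeeze is still active.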

### Alert Configuration

```yaml
# config/vbp_alerts.yml
alerts:
  squeeze_alert:
    min_intensity: 0.8
    min_length: 6
    channels: ['telegram', 'webhook']

  breakout_alert:
    min_probability: 0.75
    min_direction_confidence: 0.6
    channels: ['telegram', 'webhook', 'email']
```

### Performance Benchmarks

| Operation | Time | Batch Size |
|-----------|--------|------------|
| Feature computation | 5ms | 1 |
| CNN inference | 10ms | 1 |
| Full prediction | 20ms | 1 |
| Bulk inference | 500ms | 1000 |

---

## References

- [ET-ML-001: ML Engine Architecture](./ET-ML-001-arquitectura.md)
- [ET-ML-010: PVA Strategy](./ET-ML-010-pva-strategy.md)
- [Bollinger Bands](https://school.stockcharts.com/doku.php?id=technical_indicators:bollinger_bands)
- [TTM Squeeze Indicator](https://www.tradingview.com/scripts/ttmsqueeze/)

---

**Author:** ML-Specialist (NEXUS v4.0)
**Date:** 2026-01-25