| id | title | type | status | priority | epic | project | version | created_date | updated_date | task_reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ET-ML-012 | VBP (Volatility Breakout Predictor) Strategy | Technical Specification | Approved | High | OQI-006 | trading-platform | 1.0.0 | 2026-01-25 | 2026-01-25 | TASK-2026-01-25-ML-TRAINING-ENHANCEMENT |
ET-ML-012: VBP (Volatility Breakout Predictor) Strategy
Metadata
| Field | Value |
| --- | --- |
| ID | ET-ML-012 |
| Epic | OQI-006 - ML Signals |
| Type | Technical Specification |
| Version | 1.0.0 |
| Status | Approved |
| Last updated | 2026-01-25 |
| Reference task | TASK-2026-01-25-ML-TRAINING-ENHANCEMENT |
Summary
The VBP (Volatility Breakout Predictor) strategy predicts imminent volatility breakouts using a 1D CNN with attention, followed by XGBoost classifiers. The model specializes in detecting the volatility compressions that precede explosive moves.
Key Features
- CNN 1D + Attention: extracts local patterns and global relationships
- Volatility features: ATR, Bollinger Bands, Keltner Channels
- Squeeze detection: identifies volatility compressions
- Balanced sampling: 3x oversampling of breakout events
Architecture
High-Level Diagram
┌─────────────────────────────────────────────────────────────────────────┐
│                                VBP MODEL                                │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  Input: OHLCV Sequence (seq_len x n_features)                           │
│         └── Volatility features (ATR, BB, Keltner)                      │
│                                    │                                    │
│                                    ▼                                    │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                          CNN 1D BACKBONE                          │  │
│  │                                                                   │  │
│  │        ┌─────────────┐   ┌─────────────┐   ┌─────────────┐        │  │
│  │        │   Conv1D    │   │   Conv1D    │   │   Conv1D    │        │  │
│  │        │ 64 filters  │──▶│ 128 filters │──▶│ 256 filters │        │  │
│  │        │  kernel=7   │   │  kernel=5   │   │  kernel=3   │        │  │
│  │        └──────┬──────┘   └──────┬──────┘   └──────┬──────┘        │  │
│  │               │                 │                 │               │  │
│  │               └─────────────────┼─────────────────┘               │  │
│  │                                 ▼                                 │  │
│  │                      ┌─────────────────────┐                      │  │
│  │                      │ Multi-Scale Concat  │                      │  │
│  │                      └─────────────────────┘                      │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                    │                                    │
│                                    ▼                                    │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                          ATTENTION LAYER                          │  │
│  │                                                                   │  │
│  │  ┌─────────────────┐     ┌─────────────────────────────────────┐  │  │
│  │  │ Self-Attention  │────▶│     Temporal Attention Pooling      │  │  │
│  │  │    (4 heads)    │     │    (weighted avg over sequence)     │  │  │
│  │  └─────────────────┘     └─────────────────────────────────────┘  │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                    │                                    │
│                                    ▼                                    │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                           XGBOOST HEAD                            │  │
│  │                                                                   │  │
│  │          ┌──────────────────┐       ┌──────────────────┐          │  │
│  │          │     Breakout     │       │     Breakout     │          │  │
│  │          │    Classifier    │       │    Direction     │          │  │
│  │          │     (binary)     │       │    Classifier    │          │  │
│  │          └──────────────────┘       └──────────────────┘          │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                    │                                    │
│                                    ▼                                    │
│  Output: VBPPrediction                                                  │
│    - breakout_prob: float (0 to 1)                                      │
│    - breakout_direction: 'up' | 'down' | 'none'                         │
│    - squeeze_intensity: float                                           │
│    - expected_magnitude: float                                          │
└─────────────────────────────────────────────────────────────────────────┘
CNN 1D Configuration
CNNConfig:
    conv_channels: [64, 128, 256]
    kernel_sizes: [7, 5, 3]
    pool_sizes: [2, 2, 2]
    activation: 'gelu'
    batch_norm: True
    dropout: 0.2
| Layer | Filters | Kernel | Output |
| --- | --- | --- | --- |
| Conv1D_1 | 64 | 7 | Local patterns |
| Conv1D_2 | 128 | 5 | Mid-range patterns |
| Conv1D_3 | 256 | 3 | High-level abstractions |
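To see what the three pooling stages do to the time axis, the sketch below traces the sequence length through the stack. It assumes 'same' padding for every convolution (so only pooling shortens the sequence) and floor division in each pool; the spec states neither. `trace_shapes` is a hypothetical helper, and the input length of 60 is borrowed from the `sequence_length` value in the `VBPConfig` usage example.

```python
from typing import List, Tuple


def trace_shapes(
    seq_len: int, conv_channels: List[int], pool_sizes: List[int]
) -> List[Tuple[int, int]]:
    """Return (sequence_length, channels) after each conv + pool stage.

    Assumption: 'same' convolution padding, so only pooling changes the length.
    """
    shapes = []
    length = seq_len
    for channels, pool in zip(conv_channels, pool_sizes):
        length = length // pool  # floor, as in an unpadded pooling layer
        shapes.append((length, channels))
    return shapes


shapes = trace_shapes(60, [64, 128, 256], [2, 2, 2])
print(shapes)  # [(30, 64), (15, 128), (7, 256)]
```

Under these assumptions the attention layer would see only 7 time steps for a 60-candle input, which is worth keeping in mind when choosing `sequence_length`.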
Attention Configuration
| Parameter | Value |
| --- | --- |
| n_heads | 4 |
| attention_dim | 128 |
| dropout | 0.1 |
| pooling | Temporal weighted average |
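The "temporal weighted average" pooling can be illustrated with a few lines of NumPy. This is a minimal sketch, not the model's actual layer: it assumes each time step is scored by a dot product with a single vector `w` (a learned parameter in a real network) and that the scores are normalized with a softmax. The function name and the 7-step input are illustrative.

```python
import numpy as np


def temporal_attention_pool(h: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Collapse a (seq_len, dim) sequence into a single (dim,) vector.

    h: hidden states, one row per time step.
    w: scoring vector of shape (dim,); assumed learned in a real model.
    """
    scores = h @ w                       # one scalar score per time step
    scores = scores - scores.max()       # stabilise the softmax numerically
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ h                   # weighted average over time


rng = np.random.default_rng(0)
h = rng.normal(size=(7, 128))            # 7 time steps, attention_dim = 128
w = rng.normal(size=128)
pooled = temporal_attention_pool(h, w)
print(pooled.shape)  # (128,)
```

With a zero scoring vector every step receives the same weight and the result reduces to a plain mean over time, which is a convenient sanity check.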
Feature Engineering
Volatility Features
1. ATR (Average True Range)
ATR Features:
    atr_14: ATR(14)
    atr_7: ATR(7)
    atr_28: ATR(28)
    atr_ratio: atr_7 / atr_28
    atr_percentile: rolling percentile of ATR
    normalized_atr: atr / close
| Feature | Description | Use |
| --- | --- | --- |
| atr_14 | Standard ATR | Baseline volatility |
| atr_ratio | Short/long ATR | Expansion/contraction |
| normalized_atr | ATR as % of price | Comparison across symbols |
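The ATR features above can be sketched with pandas as follows. Two assumptions are made that the spec does not confirm: ATR is computed as a simple rolling mean of the true range (Wilder's smoothing is a common alternative), and `normalized_atr` uses `atr_14`. The helper names are hypothetical, and `atr_percentile` is omitted because its lookback is not specified.

```python
import numpy as np
import pandas as pd


def atr(df: pd.DataFrame, period: int) -> pd.Series:
    """Average True Range as a simple rolling mean of the true range (assumption)."""
    prev_close = df["close"].shift(1)
    true_range = pd.concat(
        [
            df["high"] - df["low"],
            (df["high"] - prev_close).abs(),
            (df["low"] - prev_close).abs(),
        ],
        axis=1,
    ).max(axis=1)
    return true_range.rolling(period).mean()


def atr_features(df: pd.DataFrame) -> pd.DataFrame:
    out = pd.DataFrame(index=df.index)
    for period in (7, 14, 28):
        out[f"atr_{period}"] = atr(df, period)
    out["atr_ratio"] = out["atr_7"] / out["atr_28"]        # > 1 means short-term expansion
    out["normalized_atr"] = out["atr_14"] / df["close"]    # assumption: based on atr_14
    return out


# Synthetic candles with a constant 2.0 high-low range, for illustration only
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 200).cumsum())
df = pd.DataFrame({"high": close + 1.0, "low": close - 1.0, "close": close})
feats = atr_features(df)
print(feats.tail(3))
```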
2. Bollinger Bands
BB Features:
    bb_upper: SMA(20) + 2*std
    bb_lower: SMA(20) - 2*std
    bb_width: (upper - lower) / middle
    bb_width_percentile: percentile(bb_width, 100)
    bb_squeeze: bb_width < percentile_20
    bb_expansion: bb_width > percentile_80
    bb_position: (close - lower) / (upper - lower)
| Feature | Formula | Interpretation |
| --- | --- | --- |
| bb_width | (upper - lower) / middle | Relative width |
| bb_squeeze | bb_width < percentile(20) | Volatility compression |
| bb_position | Position within the bands (0 to 1) | Overbought/oversold |
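A possible pandas rendering of the Bollinger features is shown below. It is a sketch under stated assumptions: the 20th and 80th percentiles are taken over a trailing 100-bar window of `bb_width` (inferred from `percentile(bb_width, 100)`), and the standard deviation is the population form (`ddof=0`). The function name is hypothetical and `bb_width_percentile` itself is left out.

```python
import numpy as np
import pandas as pd


def bollinger_features(
    close: pd.Series, period: int = 20, n_std: float = 2.0, lookback: int = 100
) -> pd.DataFrame:
    middle = close.rolling(period).mean()
    std = close.rolling(period).std(ddof=0)   # population std (assumption)
    upper = middle + n_std * std
    lower = middle - n_std * std
    width = (upper - lower) / middle

    # Trailing percentiles of the band width (window length is an assumption)
    p20 = width.rolling(lookback).quantile(0.20)
    p80 = width.rolling(lookback).quantile(0.80)

    return pd.DataFrame(
        {
            "bb_upper": upper,
            "bb_lower": lower,
            "bb_width": width,
            "bb_squeeze": width < p20,        # compression relative to recent history
            "bb_expansion": width > p80,
            "bb_position": (close - lower) / (upper - lower),
        }
    )


rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 300).cumsum())
bb = bollinger_features(close)
print(bb[["bb_width", "bb_squeeze", "bb_expansion"]].tail(3))
```

Note that `bb_position` can fall outside the 0 to 1 range whenever the close is outside the bands.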
3. Keltner Channels
Keltner Features:
    kc_upper: EMA(20) + 2*ATR(10)
    kc_lower: EMA(20) - 2*ATR(10)
    kc_width: (upper - lower) / middle
    kc_squeeze: bb_lower > kc_lower AND bb_upper < kc_upper
    kc_position: (close - lower) / (upper - lower)
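The Keltner features can be sketched in the same style. The function below is hypothetical and takes a precomputed ATR series and Bollinger bands as inputs. The ATR multiplier is left as a parameter because the formulas above use 2 while `VBPFeatureConfig` lists `kc_atr_mult: 1.5`; the sketch does not try to decide which one is intended.

```python
import pandas as pd


def keltner_features(
    close: pd.Series,
    atr: pd.Series,
    bb_upper: pd.Series,
    bb_lower: pd.Series,
    period: int = 20,
    atr_mult: float = 2.0,
) -> pd.DataFrame:
    middle = close.ewm(span=period, adjust=False).mean()   # EMA(period)
    upper = middle + atr_mult * atr
    lower = middle - atr_mult * atr
    return pd.DataFrame(
        {
            "kc_upper": upper,
            "kc_lower": lower,
            "kc_width": (upper - lower) / middle,
            # Squeeze: both Bollinger bands sit inside the Keltner channel
            "kc_squeeze": (bb_lower > lower) & (bb_upper < upper),
            "kc_position": (close - lower) / (upper - lower),
        }
    )


# Hand-made inputs: flat price, constant ATR, one bar where the bands widen
close = pd.Series([100.0] * 5)
atr = pd.Series([2.0] * 5)
bb_upper = pd.Series([103.0, 103.0, 105.0, 103.0, 103.0])
bb_lower = pd.Series([97.0, 97.0, 95.0, 97.0, 97.0])
kc = keltner_features(close, atr, bb_upper, bb_lower)
print(kc["kc_squeeze"].tolist())  # [True, True, False, True, True]
```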
4. Squeeze Detection
Squeeze Features:
    is_squeeze: BB inside Keltner
    squeeze_length: consecutive squeeze bars
    squeeze_momentum: Rate of change during squeeze
    squeeze_release: first bar after squeeze ends
    squeeze_intensity: bb_width / kc_width
| State | Condition | Meaning |
| --- | --- | --- |
| SQUEEZE ON | BB inside KC | Compression active |
| SQUEEZE OFF | BB outside KC | Expansion started |
| SQUEEZE RELEASE | Transition | Breakout moment |
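Given a boolean `is_squeeze` series, the run length and the release bar follow from a cumulative-sum trick. The sketch below is one way to derive `squeeze_length` and `squeeze_release`; the function name is hypothetical and `squeeze_momentum` is not covered.

```python
import pandas as pd


def squeeze_state(is_squeeze: pd.Series) -> pd.DataFrame:
    is_squeeze = is_squeeze.astype(bool)
    # Every non-squeeze bar starts a new group, so a cumulative sum of the
    # squeeze flags inside each group counts consecutive squeeze bars.
    run_id = (~is_squeeze).cumsum()
    length = is_squeeze.astype(int).groupby(run_id).cumsum()
    # Release: the previous bar was in a squeeze and the current one is not.
    release = is_squeeze.shift(1, fill_value=False) & ~is_squeeze
    return pd.DataFrame(
        {"is_squeeze": is_squeeze, "squeeze_length": length, "squeeze_release": release}
    )


flags = pd.Series([False, True, True, True, False, False, True, False])
state = squeeze_state(flags)
print(state["squeeze_length"].tolist())   # [0, 1, 2, 3, 0, 0, 1, 0]
print(state["squeeze_release"].tolist())  # [False, False, False, False, True, False, False, True]
```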
Feature Summary
VBPFeatureConfig:
    atr_periods: [7, 14, 28]
    bb_period: 20
    bb_std: 2.0
    kc_period: 20
    kc_atr_mult: 1.5
    squeeze_lookback: 20
    feature_count: 45  # Total features
Balanced Sampling
Problem: Class Imbalance
Breakout events are rare (~5-10% of the data), which causes:
- A model biased toward the majority class
- Low recall on breakouts
- Misleading metrics
Solution: 3x Oversample
import numpy as np


class BalancedSampler:
    """Oversample breakout events (3x by default)."""

    def __init__(self, oversample_factor: float = 3.0):
        self.oversample_factor = oversample_factor

    def fit_resample(self, X: np.ndarray, y: np.ndarray):
        # Identify breakouts (minority class)
        breakout_mask = y == 1
        n_breakouts = int(breakout_mask.sum())
        # Number of extra breakout samples needed to reach the target count
        target_breakouts = int(n_breakouts * self.oversample_factor)
        additional_samples = max(target_breakouts - n_breakouts, 0)
        # Resample with replacement
        breakout_indices = np.where(breakout_mask)[0]
        resampled_indices = np.random.choice(
            breakout_indices,
            size=additional_samples,
            replace=True
        )
        # Append the resampled rows to the original data
        X_resampled = np.vstack([X, X[resampled_indices]])
        y_resampled = np.concatenate([y, y[resampled_indices]])
        return X_resampled, y_resampled
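The effect of the 3x factor on the class share can be checked with a small worked example. The sketch below restates the resampling step as a standalone function with a seeded generator so the run is reproducible; `oversample_minority` and the toy data (100 samples, 8 breakouts) are illustrative, not part of the spec.

```python
import numpy as np


def oversample_minority(X: np.ndarray, y: np.ndarray, factor: float = 3.0, seed: int = 0):
    """Duplicate class-1 rows (with replacement) until there are factor x as many."""
    rng = np.random.default_rng(seed)
    idx = np.flatnonzero(y == 1)
    n_extra = int(len(idx) * factor) - len(idx)
    extra = rng.choice(idx, size=n_extra, replace=True)
    return np.vstack([X, X[extra]]), np.concatenate([y, y[extra]])


X = np.arange(200, dtype=float).reshape(100, 2)
y = np.zeros(100, dtype=int)
y[:8] = 1                                   # 8% breakouts, within the 5-10% range

X_bal, y_bal = oversample_minority(X, y)
print(len(X_bal), int(y_bal.sum()), round(float(y_bal.mean()), 3))  # 116 24 0.207
```

A 3x factor raises the breakout share from 8% to roughly 21%, so the classes are still not fully balanced after resampling.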
Alternatives Considered
| Method | Pro | Con | Selected |
| --- | --- | --- | --- |
| SMOTE | Generates new samples | Can create noise | No |
| Class Weights | Simple | Less effective for severe imbalance | As a complement |
| Oversample 3x | Robust, keeps the real data distribution | Risk of overfitting | Yes |
XGBoost Configuration
xgb_params = {
    'n_estimators': 300,
    'max_depth': 5,
    'learning_rate': 0.05,
    'scale_pos_weight': 3,  # Complements the oversampling
    'subsample': 0.8,
    'colsample_bytree': 0.7
}
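To put `scale_pos_weight: 3` in context: the XGBoost documentation suggests the ratio of negative to positive instances as a typical starting value. The arithmetic below uses illustrative counts (1000 samples, 8% breakouts) to compare that ratio before and after a 3x oversample; `neg_pos_ratio` is a hypothetical helper.

```python
def neg_pos_ratio(n_negative: int, n_positive: int) -> float:
    """Negative-to-positive ratio, the usual starting point for scale_pos_weight."""
    return n_negative / n_positive


n_total, n_pos = 1000, 80                                   # illustrative counts
before = neg_pos_ratio(n_total - n_pos, n_pos)              # raw data
after = neg_pos_ratio(n_total - n_pos, 3 * n_pos)           # after 3x oversample
print(round(before, 2), round(after, 2))  # 11.5 3.83
```

With these numbers the residual imbalance after oversampling is close to 4:1, which is in the same range as the configured weight of 3.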
Breakout Detection
Breakout Definition
A breakout is defined by the following conditions:
- Squeeze active for at least N candles
- Price breaks the upper/lower band
- Volume above average
- Confirmed move of X%
import numpy as np
import pandas as pd


def label_breakouts(
    df: pd.DataFrame,
    squeeze_min_length: int = 6,
    breakout_threshold: float = 0.015,  # 1.5%
    forward_bars: int = 12
) -> np.ndarray:
    """Label breakouts for training (1 = up, -1 = down, 0 = none)."""
    labels = np.zeros(len(df))
    for i in range(len(df) - forward_bars):
        # Check for an active squeeze
        if df['squeeze_length'].iloc[i] >= squeeze_min_length:
            # Check for a breakout in the following window
            future_high = df['high'].iloc[i+1:i+forward_bars+1].max()
            future_low = df['low'].iloc[i+1:i+forward_bars+1].min()
            up_move = (future_high - df['close'].iloc[i]) / df['close'].iloc[i]
            down_move = (df['close'].iloc[i] - future_low) / df['close'].iloc[i]
            # If both thresholds are hit, the up label takes precedence
            if up_move >= breakout_threshold:
                labels[i] = 1  # Breakout Up
            elif down_move >= breakout_threshold:
                labels[i] = -1  # Breakout Down
    return labels
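For long histories the per-row loop can be slow. The following is a vectorized sketch intended to reproduce the same labels, using a rolling window over the reversed series to look forward in time. The function name is hypothetical, and the equivalence has only been reasoned through on small cases such as the toy frame below.

```python
import numpy as np
import pandas as pd


def label_breakouts_vectorized(
    df: pd.DataFrame,
    squeeze_min_length: int = 6,
    breakout_threshold: float = 0.015,
    forward_bars: int = 12,
) -> np.ndarray:
    # Rolling over the reversed series covers bars i..i+forward_bars-1;
    # shift(-1) moves the window to i+1..i+forward_bars (current bar excluded).
    future_high = df["high"][::-1].rolling(forward_bars).max()[::-1].shift(-1)
    future_low = df["low"][::-1].rolling(forward_bars).min()[::-1].shift(-1)

    up_move = (future_high - df["close"]) / df["close"]
    down_move = (df["close"] - future_low) / df["close"]
    in_squeeze = df["squeeze_length"] >= squeeze_min_length

    labels = np.zeros(len(df))
    # Incomplete windows yield NaN moves, which compare as False and stay 0.
    labels[(in_squeeze & (down_move >= breakout_threshold)).to_numpy()] = -1
    labels[(in_squeeze & (up_move >= breakout_threshold)).to_numpy()] = 1  # up wins ties
    return labels


df = pd.DataFrame(
    {
        "high": [100.5, 100.5, 102.0, 100.5, 100.5, 100.5],
        "low": [99.5, 99.5, 99.5, 98.0, 99.5, 99.5],
        "close": [100.0] * 6,
        "squeeze_length": [6, 6, 6, 0, 6, 6],
    }
)
labels = label_breakouts_vectorized(df, forward_bars=2)
print(labels.tolist())  # [1.0, 1.0, -1.0, 0.0, 0.0, 0.0]
```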
Breakout Types
| Type | Condition | Suggested Action |
| --- | --- | --- |
| BREAKOUT_UP | Breaks the upper band with volume | Long |
| BREAKOUT_DOWN | Breaks the lower band with volume | Short |
| FALSE_BREAKOUT | Breaks but reverses quickly | Fade |
| NO_BREAKOUT | Squeeze without resolution | Wait |
Training Pipeline
Phase 1: Data Preparation
# 1. Compute volatility features
features = vbp_feature_engineer.compute_volatility_features(df)
# 2. Label breakouts
labels = label_breakouts(df, squeeze_min_length=6)
# 3. Apply balanced sampling
# X and y are the feature matrix and label vector built from `features` and `labels`
X_balanced, y_balanced = balanced_sampler.fit_resample(X, y)
print(f"Original: {len(X)} samples, {(y==1).sum()} breakouts")
print(f"Balanced: {len(X_balanced)} samples, {(y_balanced==1).sum()} breakouts")
Phase 2: CNN Training
# Train CNN + Attention
cnn_model = VBPCNNModel(config)
cnn_model.fit(
    X_train_sequences,
    y_train,
    epochs=50,
    batch_size=64,
    validation_data=(X_val, y_val)
)
# Extract features from the backbone (same sequences used for training)
cnn_features = cnn_model.extract_features(X_train_sequences)
Phase 3: XGBoost Training
import numpy as np
from xgboost import XGBClassifier

# Combine CNN features with the original features
combined_features = np.concatenate([
    cnn_features,
    volatility_features,
    squeeze_features
], axis=1)
# Train the breakout classifier on a binary target (breakout vs. no breakout)
is_breakout = (breakout_labels != 0).astype(int)
xgb_breakout = XGBClassifier(**xgb_params)
xgb_breakout.fit(combined_features, is_breakout)
# Train the direction classifier (breakout samples only)
# direction_labels is expected to be encoded as 0/1, as XGBClassifier requires
breakout_mask = breakout_labels != 0
xgb_direction = XGBClassifier(**xgb_params)
xgb_direction.fit(
    combined_features[breakout_mask],
    direction_labels[breakout_mask]
)
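At inference time the two heads have to be merged into the `breakout_direction` field ('up', 'down' or 'none'). The spec does not say how 'none' is decided, so the sketch below is one plausible rule: report 'none' when the breakout probability is below a cutoff (0.5 here, an assumption), otherwise pick the more likely direction. `combine_heads` is a hypothetical helper.

```python
from typing import Tuple


def combine_heads(
    breakout_prob: float, up_prob: float, min_breakout_prob: float = 0.5
) -> Tuple[str, float]:
    """Map both classifier outputs to (breakout_direction, direction_confidence).

    breakout_prob: P(breakout) from the binary head.
    up_prob: P(up | breakout) from the direction head.
    """
    if breakout_prob < min_breakout_prob:
        return "none", 0.0
    if up_prob >= 0.5:
        return "up", up_prob
    return "down", 1.0 - up_prob


print(combine_heads(0.3, 0.9))  # ('none', 0.0)
print(combine_heads(0.8, 0.7))  # ('up', 0.7)
```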
Evaluation Metrics
Breakout Detection Metrics
| Metric | Description | Target |
| --- | --- | --- |
| Breakout Recall | Share of real breakouts detected | >= 70% |
| Breakout Precision | Share of predicted breakouts that occur | >= 50% |
| F1 Score | Precision/recall balance | >= 0.55 |
| False Positive Rate | False alarms | < 15% |
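The detection targets can be checked directly from confusion-matrix counts. The sketch below uses the standard definitions of recall, precision, F1 and false positive rate; the helper names and the example counts are illustrative.

```python
from typing import Dict


def detection_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return {
        "recall": recall,
        "precision": precision,
        "f1": 2 * precision * recall / (precision + recall),
        "false_positive_rate": fp / (fp + tn),
    }


def meets_targets(m: Dict[str, float]) -> bool:
    """Compare against the targets in the table above."""
    return (
        m["recall"] >= 0.70
        and m["precision"] >= 0.50
        and m["f1"] >= 0.55
        and m["false_positive_rate"] < 0.15
    )


m = detection_metrics(tp=70, fp=60, fn=30, tn=840)
print({k: round(v, 3) for k, v in m.items()})
# {'recall': 0.7, 'precision': 0.538, 'f1': 0.609, 'false_positive_rate': 0.067}
print(meets_targets(m))  # True
```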
Direction Metrics
| Metric | Description | Target |
| --- | --- | --- |
| Direction Accuracy | Correct direction when a breakout occurs | >= 65% |
| Timing Error | Error in candles until breakout | < 3 candles |
Trading Metrics
| Metric | Description | Target |
| --- | --- | --- |
| Profit Factor | Gross profit / Gross loss | > 1.5 |
| Win Rate | Winning trades / Total trades | > 55% |
| Avg Win/Loss Ratio | Average win / Average loss | > 1.2 |
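The trading metrics follow from a list of per-trade results. The sketch below applies the formulas in the table to a made-up P&L list; it assumes at least one winning and one losing trade, and treats break-even trades as neither. `trading_metrics` is a hypothetical helper.

```python
from typing import Dict, List


def trading_metrics(pnl: List[float]) -> Dict[str, float]:
    wins = [p for p in pnl if p > 0]
    losses = [-p for p in pnl if p < 0]     # stored as positive magnitudes
    return {
        "profit_factor": sum(wins) / sum(losses),
        "win_rate": len(wins) / len(pnl),
        "avg_win_loss_ratio": (sum(wins) / len(wins)) / (sum(losses) / len(losses)),
    }


pnl = [2.0, -1.0, 1.5, -1.0, 3.0, -1.5, 2.5, 1.0, -1.0, 0.5]   # illustrative trades
tm = trading_metrics(pnl)
print({k: round(v, 3) for k, v in tm.items()})
# {'profit_factor': 2.333, 'win_rate': 0.6, 'avg_win_loss_ratio': 1.556}
```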
API and Usage
Main Class: VBPModel
from models.strategies.vbp import VBPModel, VBPConfig

# Configuration
config = VBPConfig(
    conv_channels=[64, 128, 256],
    attention_heads=4,
    sequence_length=60,
    xgb_n_estimators=300,
    oversample_factor=3.0
)
# Initialize the model
model = VBPModel(config)
# Train
metrics = model.fit(df_train, df_val)
# Predict
predictions = model.predict(df_new)
for pred in predictions:
    print(f"Breakout Prob: {pred.breakout_prob:.2%}")
    print(f"Direction: {pred.breakout_direction}")
    print(f"Squeeze Intensity: {pred.squeeze_intensity:.2f}")
VBPPrediction Class
from dataclasses import dataclass
from typing import Optional


@dataclass
class VBPPrediction:
    breakout_prob: float        # Probability of a breakout
    breakout_direction: str     # 'up', 'down', 'none'
    direction_confidence: float
    squeeze_intensity: float    # Intensity of the current squeeze
    squeeze_length: int         # Candles in squeeze
    expected_magnitude: float   # Expected breakout magnitude
    time_to_breakout: int       # Estimated candles until breakout

    @property
    def is_high_probability(self) -> bool:
        return self.breakout_prob > 0.7

    @property
    def signal(self) -> Optional[str]:
        if self.breakout_prob > 0.7 and self.direction_confidence > 0.6:
            return f"BREAKOUT_{self.breakout_direction.upper()}"
        elif self.squeeze_intensity > 0.8:
            return "SQUEEZE_ALERT"
        return None
Squeeze Alerts
from typing import Dict, List


class SqueezeAlert:
    """Alert system for squeezes and breakouts."""

    def check_alerts(self, prediction: VBPPrediction) -> List[Dict]:
        alerts = []
        # Intense squeeze alert
        if prediction.squeeze_intensity > 0.9:
            alerts.append({
                'type': 'SQUEEZE_EXTREME',
                'severity': 'HIGH',
                'message': f'Extreme squeeze detected ({prediction.squeeze_length} bars)'
            })
        # Imminent breakout alert
        if prediction.breakout_prob > 0.8:
            alerts.append({
                'type': 'BREAKOUT_IMMINENT',
                'severity': 'CRITICAL',
                'message': f'Breakout {prediction.breakout_direction} likely'
            })
        return alerts
File Structure
apps/ml-engine/src/models/strategies/vbp/
├── __init__.py
├── model.py                 # VBPModel, VBPConfig, VBPPrediction
├── feature_engineering.py   # VBPFeatureEngineer (volatility features)
├── cnn_backbone.py          # CNN 1D + Attention architecture
├── balanced_sampler.py      # Oversample implementation
└── trainer.py               # VBPTrainer
Production Considerations
Real-Time Detection
from typing import Dict, Optional


class VBPRealtime:
    def __init__(self, model_path: str):
        self.model = VBPModel.load(model_path)
        self.squeeze_tracker = SqueezeTracker()

    def on_candle(self, candle: Dict) -> Optional[VBPPrediction]:
        # Update the tracker
        self.squeeze_tracker.update(candle)
        # Only predict while a squeeze is active
        if self.squeeze_tracker.is_squeeze:
            features = self.compute_features()
            return self.model.predict_single(features)
        return None
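`SqueezeTracker` is referenced above but not defined in this document. A minimal, hypothetical version is sketched below to show the state it would need to keep. It assumes each candle dictionary already carries the four band values (`bb_upper`, `bb_lower`, `kc_upper`, `kc_lower`); a real tracker would more likely compute them from a rolling window of candles.

```python
from typing import Dict


class SqueezeTracker:
    """Hypothetical tracker: counts consecutive bars with BB inside KC."""

    def __init__(self) -> None:
        self.squeeze_length = 0
        self.released = False

    @property
    def is_squeeze(self) -> bool:
        return self.squeeze_length > 0

    def update(self, candle: Dict[str, float]) -> None:
        inside = (
            candle["bb_lower"] > candle["kc_lower"]
            and candle["bb_upper"] < candle["kc_upper"]
        )
        # A release is the first bar after an active squeeze ends
        self.released = self.is_squeeze and not inside
        self.squeeze_length = self.squeeze_length + 1 if inside else 0


inside = {"bb_upper": 103.0, "bb_lower": 97.0, "kc_upper": 104.0, "kc_lower": 96.0}
outside = {"bb_upper": 106.0, "bb_lower": 94.0, "kc_upper": 104.0, "kc_lower": 96.0}

tracker = SqueezeTracker()
for candle in (inside, inside, inside, outside):
    tracker.update(candle)
print(tracker.squeeze_length, tracker.released)  # 0 True
```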
Alert Configuration
# config/vbp_alerts.yml
alerts:
  squeeze_alert:
    min_intensity: 0.8
    min_length: 6
    channels: ['telegram', 'webhook']
  breakout_alert:
    min_probability: 0.75
    min_direction_confidence: 0.6
    channels: ['telegram', 'webhook', 'email']
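One way these thresholds could be applied is sketched below. The configuration is inlined as a Python dictionary to keep the example dependency-free, the prediction is passed as a plain dictionary, and the `min_*` keys are interpreted as inclusive lower bounds; all three choices are assumptions, as is the `route_alerts` name.

```python
from typing import Dict, List

# Mirrors the `alerts` section of config/vbp_alerts.yml
ALERT_CONFIG = {
    "squeeze_alert": {
        "min_intensity": 0.8,
        "min_length": 6,
        "channels": ["telegram", "webhook"],
    },
    "breakout_alert": {
        "min_probability": 0.75,
        "min_direction_confidence": 0.6,
        "channels": ["telegram", "webhook", "email"],
    },
}


def route_alerts(pred: Dict[str, float], config: Dict = ALERT_CONFIG) -> Dict[str, List[str]]:
    """Return the channels to notify for each alert type that fires."""
    routed: Dict[str, List[str]] = {}
    sq = config["squeeze_alert"]
    if pred["squeeze_intensity"] >= sq["min_intensity"] and pred["squeeze_length"] >= sq["min_length"]:
        routed["squeeze_alert"] = sq["channels"]
    bo = config["breakout_alert"]
    if (
        pred["breakout_prob"] >= bo["min_probability"]
        and pred["direction_confidence"] >= bo["min_direction_confidence"]
    ):
        routed["breakout_alert"] = bo["channels"]
    return routed


pred = {"breakout_prob": 0.82, "direction_confidence": 0.65,
        "squeeze_intensity": 0.85, "squeeze_length": 4}
routed = route_alerts(pred)
print(routed)  # {'breakout_alert': ['telegram', 'webhook', 'email']}
```

The squeeze alert does not fire in this example because the squeeze has lasted only 4 bars, below `min_length`.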
Performance Benchmarks
| Operation | Time | Batch Size |
| --- | --- | --- |
| Feature computation | 5ms | 1 |
| CNN inference | 10ms | 1 |
| Full prediction | 20ms | 1 |
| Bulk inference | 500ms | 1000 |
References
Author: ML-Specialist (NEXUS v4.0)
Date: 2026-01-25