---
id: "ET-ML-012"
title: "VBP (Volatility Breakout Predictor) Strategy"
type: "Technical Specification"
status: "Approved"
priority: "Alta"
epic: "OQI-006"
project: "trading-platform"
version: "1.0.0"
created_date: "2026-01-25"
updated_date: "2026-01-25"
task_reference: "TASK-2026-01-25-ML-TRAINING-ENHANCEMENT"
---
# ET-ML-012: VBP (Volatility Breakout Predictor) Strategy
## Metadata
| Field | Value |
|-------|-------|
| **ID** | ET-ML-012 |
| **Epic** | OQI-006 - ML Signals |
| **Type** | Technical Specification |
| **Version** | 1.0.0 |
| **Status** | Approved |
| **Last updated** | 2026-01-25 |
| **Reference task** | TASK-2026-01-25-ML-TRAINING-ENHANCEMENT |
---
## Summary
The VBP (Volatility Breakout Predictor) strategy predicts imminent volatility breakouts using a **CNN 1D** with **Attention** that feeds an **XGBoost** head. The model specializes in detecting volatility compressions (squeezes) that tend to precede explosive price moves.
### Key Features
- **CNN 1D + Attention**: Extracts local patterns and global relationships across the sequence
- **Volatility features**: ATR, Bollinger Bands, Keltner Channels
- **Squeeze detection**: Identifies volatility compressions
- **Balanced sampling**: 3x oversampling of breakout events
---
## Architecture
### High-Level Diagram
```
┌─────────────────────────────────────────────────────────────────────────┐
│                                VBP MODEL                                │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  Input: OHLCV Sequence (seq_len x n_features)                           │
│         └── Volatility features (ATR, BB, Keltner)                      │
│                                    │                                    │
│                                    ▼                                    │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                          CNN 1D BACKBONE                          │  │
│  │                                                                   │  │
│  │        ┌─────────────┐   ┌─────────────┐   ┌─────────────┐        │  │
│  │        │   Conv1D    │   │   Conv1D    │   │   Conv1D    │        │  │
│  │        │ 64 filters  │──▶│ 128 filters │──▶│ 256 filters │        │  │
│  │        │  kernel=7   │   │  kernel=5   │   │  kernel=3   │        │  │
│  │        └─────────────┘   └─────────────┘   └─────────────┘        │  │
│  │               │                 │                 │               │  │
│  │               └─────────────────┼─────────────────┘               │  │
│  │                                 ▼                                 │  │
│  │                      ┌─────────────────────┐                      │  │
│  │                      │ Multi-Scale Concat  │                      │  │
│  │                      └─────────────────────┘                      │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                    │                                    │
│                                    ▼                                    │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                          ATTENTION LAYER                          │  │
│  │                                                                   │  │
│  │  ┌─────────────────┐     ┌─────────────────────────────────────┐  │  │
│  │  │ Self-Attention  │────▶│     Temporal Attention Pooling      │  │  │
│  │  │    (4 heads)    │     │    (weighted avg over sequence)     │  │  │
│  │  └─────────────────┘     └─────────────────────────────────────┘  │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                    │                                    │
│                                    ▼                                    │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                           XGBOOST HEAD                            │  │
│  │                                                                   │  │
│  │          ┌──────────────────┐       ┌──────────────────┐          │  │
│  │          │     Breakout     │       │     Breakout     │          │  │
│  │          │    Classifier    │       │    Direction     │          │  │
│  │          │     (binary)     │       │    Classifier    │          │  │
│  │          └──────────────────┘       └──────────────────┘          │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                    │                                    │
│                                    ▼                                    │
│  Output: VBPPrediction                                                  │
│    - breakout_prob: float (0 to 1)                                      │
│    - breakout_direction: 'up' | 'down' | 'none'                         │
│    - squeeze_intensity: float                                           │
│    - expected_magnitude: float                                          │
└─────────────────────────────────────────────────────────────────────────┘
```
### CNN 1D Configuration
```python
CNNConfig:
    conv_channels: [64, 128, 256]
    kernel_sizes: [7, 5, 3]
    pool_sizes: [2, 2, 2]
    activation: 'gelu'
    batch_norm: True
    dropout: 0.2
```
| Layer | Filters | Kernel | Output |
|-------|---------|--------|--------|
| Conv1D_1 | 64 | 7 | Local patterns |
| Conv1D_2 | 128 | 5 | Mid-range patterns |
| Conv1D_3 | 256 | 3 | High-level abstractions |
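
To make the effect of the three pooling stages concrete, the sketch below traces the sequence length and the receptive field through the backbone. It assumes the stages are applied in sequence, with 'same' padding and stride 1 in every convolution and a pooling stride equal to the pool size; none of this is stated in the spec. `trace_lengths` and `receptive_field` are illustrative helpers, not part of the codebase.

```python
from typing import List


def trace_lengths(seq_len: int, pool_sizes: List[int]) -> List[int]:
    """Sequence length after each Conv1D + pooling stage.

    Assumption: 'same' padding and stride 1 in every convolution, so only
    the pooling layers (floor division by the pool size) shorten the sequence.
    """
    lengths = []
    current = seq_len
    for pool in pool_sizes:
        current //= pool
        lengths.append(current)
    return lengths


def receptive_field(kernel_sizes: List[int], pool_sizes: List[int]) -> int:
    """Number of input steps seen by one output step of the last stage."""
    field, jump = 1, 1
    for kernel, pool in zip(kernel_sizes, pool_sizes):
        field += (kernel - 1) * jump  # stride-1 convolution
        field += (pool - 1) * jump    # pooling window of size `pool`
        jump *= pool                  # assumed pooling stride = pool size
    return field


lengths = trace_lengths(60, [2, 2, 2])
field = receptive_field([7, 5, 3], [2, 2, 2])
print(lengths, field)
```

With a 60-candle input, the lengths under these assumptions are 30, 15 and 7, and each step of the last stage covers 30 input candles.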
### Attention Configuration
| Parameter | Value |
|-----------|-------|
| **n_heads** | 4 |
| **attention_dim** | 128 |
| **dropout** | 0.1 |
| **pooling** | Temporal weighted average |
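
The "temporal weighted average" pooling can be sketched in NumPy as follows. This is a minimal illustration, assuming a single scoring vector `w` (hypothetical; in a trained model it would be learned) that assigns one score per time step. It does not reproduce the 4-head self-attention that precedes the pooling.

```python
import numpy as np


def temporal_attention_pool(h: np.ndarray, w: np.ndarray):
    """Collapse (seq_len, dim) hidden states into one (dim,) vector.

    h: hidden states, one row per time step.
    w: scoring vector of shape (dim,); a stand-in for learned parameters.
    """
    scores = h @ w                   # one score per time step
    scores = scores - scores.max()   # shift for numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum()  # softmax over the time axis
    pooled = weights @ h             # weighted average of the rows
    return pooled, weights


rng = np.random.default_rng(0)
h = rng.normal(size=(7, 128))        # e.g. 7 time steps, attention_dim = 128
w = rng.normal(size=128)
pooled, weights = temporal_attention_pool(h, w)
print(pooled.shape, weights.sum())
```

When all scores are equal the weights become uniform and the result reduces to a plain mean over time, which is a useful sanity check.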
---
## Feature Engineering
### Volatility Features
#### 1. ATR (Average True Range)
```python
ATR Features:
    atr_14: ATR(14)
    atr_7: ATR(7)
    atr_28: ATR(28)
    atr_ratio: atr_7 / atr_28
    atr_percentile: rolling percentile of ATR
    normalized_atr: atr / close
```
| Feature | Description | Use |
|---------|-------------|-----|
| `atr_14` | Standard ATR | Baseline volatility |
| `atr_ratio` | Short/long ATR ratio | Expansion/contraction |
| `normalized_atr` | ATR as a fraction of price | Cross-symbol comparison |
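
As a worked example of how `atr_ratio` signals contraction, the sketch below computes the ATR features on a small synthetic series. It assumes ATR is a simple mean of the last `period` true ranges; the spec does not say whether Wilder smoothing is used instead, and the helper names are illustrative.

```python
from typing import List, Optional


def true_range(high: float, low: float, prev_close: float) -> float:
    """Largest of the bar range and the two gaps from the previous close."""
    return max(high - low, abs(high - prev_close), abs(low - prev_close))


def atr(highs, lows, closes, period: int) -> List[Optional[float]]:
    """ATR as a simple mean of the last `period` true ranges (None until available)."""
    # ranges[j] is the true range of bar j + 1 (bar 0 has no previous close)
    ranges = [
        true_range(highs[i], lows[i], closes[i - 1])
        for i in range(1, len(closes))
    ]
    out: List[Optional[float]] = [None] * len(closes)
    for i in range(period, len(closes)):
        out[i] = sum(ranges[i - period:i]) / period
    return out


# 23 wide bars (range 2) followed by 7 narrow bars (range 1), flat close at 100
closes = [100.0] * 30
highs = [101.0] * 23 + [100.5] * 7
lows = [99.0] * 23 + [99.5] * 7

atr_7 = atr(highs, lows, closes, 7)[-1]
atr_28 = atr(highs, lows, closes, 28)[-1]
atr_ratio = atr_7 / atr_28
normalized_atr = atr(highs, lows, closes, 14)[-1] / closes[-1]
print(atr_7, atr_28, atr_ratio, normalized_atr)
```

Here the short ATR falls to 1.0 while the long ATR is still 1.75, so `atr_ratio` drops well below 1, which is the contraction pattern the feature is meant to capture.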
#### 2. Bollinger Bands
```python
BB Features:
    bb_upper: SMA(20) + 2*std
    bb_lower: SMA(20) - 2*std
    bb_width: (upper - lower) / middle
    bb_width_percentile: percentile(bb_width, 100)
    bb_squeeze: bb_width < percentile_20
    bb_expansion: bb_width > percentile_80
    bb_position: (close - lower) / (upper - lower)
```
| Feature | Formula | Interpretation |
|---------|---------|----------------|
| `bb_width` | `(upper - lower) / middle` | Relative width |
| `bb_squeeze` | `bb_width < percentile(20)` | Volatility compression |
| `bb_position` | Position within the bands (0-1) | Overbought/oversold |
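
The formulas above can be checked on a toy series with a standard-library sketch. It assumes the population standard deviation over the last 20 closes and a simple rank-based test for the 20th percentile; both choices, and the helper names, are assumptions rather than the project's implementation.

```python
from statistics import fmean, pstdev
from typing import Dict, List


def bollinger_features(closes: List[float], period: int = 20,
                       n_std: float = 2.0) -> Dict[str, float]:
    """Bollinger features for the most recent bar."""
    window = closes[-period:]
    middle = fmean(window)
    std = pstdev(window)  # assumption: population standard deviation
    upper = middle + n_std * std
    lower = middle - n_std * std
    band = upper - lower
    return {
        "bb_upper": upper,
        "bb_lower": lower,
        "bb_width": band / middle,
        # 0 = at the lower band, 1 = at the upper band; 0.5 if the bands collapse
        "bb_position": (closes[-1] - lower) / band if band > 0 else 0.5,
    }


def is_bb_squeeze(width_history: List[float], current_width: float,
                  pct: float = 0.20) -> bool:
    """True when fewer than `pct` of the past widths are below the current one."""
    below = sum(1 for w in width_history if w < current_width)
    return below / len(width_history) < pct


closes = [99.0, 101.0] * 10                   # mean 100, population std 1
bb = bollinger_features(closes)
history = [i / 1000 for i in range(1, 101)]   # past widths 0.001 .. 0.100
print(bb, is_bb_squeeze(history, 0.015))
```

For this series the bands sit at 98 and 102, the relative width is 0.04, and the last close of 101 gives a `bb_position` of 0.75.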
#### 3. Keltner Channels
```python
Keltner Features:
    kc_upper: EMA(20) + 2*ATR(10)
    kc_lower: EMA(20) - 2*ATR(10)
    kc_width: (upper - lower) / middle
    kc_squeeze: bb_lower > kc_lower AND bb_upper < kc_upper
    kc_position: (close - lower) / (upper - lower)
```
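
The squeeze condition is sensitive to the ATR multiplier. The channel formula above uses `2*ATR(10)`, while `VBPFeatureConfig` lists `kc_atr_mult: 1.5`, so the sketch below leaves the multiplier as a parameter and shows that the same Bollinger Bands can be inside one channel and outside the other. The EMA seeding (first value of the series) and all helper names are assumptions.

```python
from typing import List, Tuple


def ema(values: List[float], period: int) -> float:
    """Exponential moving average of the whole series, seeded with the first value."""
    alpha = 2.0 / (period + 1)
    result = values[0]
    for value in values[1:]:
        result = alpha * value + (1.0 - alpha) * result
    return result


def keltner_channel(closes: List[float], atr_value: float,
                    period: int = 20, atr_mult: float = 2.0) -> Tuple[float, float]:
    """Return (kc_lower, kc_upper) for the most recent bar."""
    middle = ema(closes, period)
    return middle - atr_mult * atr_value, middle + atr_mult * atr_value


def bb_inside_kc(bb_lower: float, bb_upper: float,
                 kc_lower: float, kc_upper: float) -> bool:
    """Squeeze condition: both Bollinger Bands lie inside the Keltner Channel."""
    return bb_lower > kc_lower and bb_upper < kc_upper


closes = [100.0] * 40
wide = keltner_channel(closes, atr_value=2.0, atr_mult=2.0)    # about (96, 104)
narrow = keltner_channel(closes, atr_value=2.0, atr_mult=1.5)  # about (97, 103)
squeeze_wide = bb_inside_kc(96.5, 103.5, *wide)
squeeze_narrow = bb_inside_kc(96.5, 103.5, *narrow)
print(squeeze_wide, squeeze_narrow)
```

A wider channel flags more bars as squeezes, so the multiplier directly affects how many candidate bars reach the labelling step.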
#### 4. Squeeze Detection
```python
Squeeze Features:
    is_squeeze: BB inside Keltner
    squeeze_length: consecutive squeeze bars
    squeeze_momentum: Rate of change during squeeze
    squeeze_release: first bar after squeeze ends
    squeeze_intensity: bb_width / kc_width
```
| State | Condition | Meaning |
|-------|-----------|---------|
| **SQUEEZE ON** | BB inside KC | Compression active |
| **SQUEEZE OFF** | BB outside KC | Expansion under way |
| **SQUEEZE RELEASE** | Transition from ON to OFF | Breakout moment |
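
The three states and the `squeeze_length` counter can be derived from the per-bar `is_squeeze` flag alone. The following sketch shows one way to do it; `squeeze_states` is a hypothetical helper, and it treats the release as the single bar right after a squeeze ends.

```python
from typing import List, Tuple


def squeeze_states(is_squeeze: List[bool]) -> Tuple[List[int], List[str]]:
    """Return per-bar squeeze_length and state labels from the squeeze flag."""
    lengths: List[int] = []
    states: List[str] = []
    run = 0
    previous = False
    for flag in is_squeeze:
        run = run + 1 if flag else 0     # consecutive squeeze bars so far
        if flag:
            state = "SQUEEZE ON"
        elif previous:
            state = "SQUEEZE RELEASE"    # first bar after the squeeze ends
        else:
            state = "SQUEEZE OFF"
        lengths.append(run)
        states.append(state)
        previous = flag
    return lengths, states


flags = [False, True, True, True, False, False]
lengths, states = squeeze_states(flags)
print(lengths)
print(states)
```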
### Feature Summary
```python
VBPFeatureConfig:
    atr_periods: [7, 14, 28]
    bb_period: 20
    bb_std: 2.0
    kc_period: 20
    kc_atr_mult: 1.5
    squeeze_lookback: 20
    feature_count: 45  # Total features
```
---
## Balanced Sampling
### Problem: Class Imbalance
Breakout events are rare (~5-10% of the data), which causes:
- A model biased toward the majority class
- Low recall on breakouts
- Misleading metrics
### Solution: 3x Oversampling
```python
import numpy as np


class BalancedSampler:
    """Oversample breakout events (3x by default)."""

    def __init__(self, oversample_factor: float = 3.0):
        self.oversample_factor = oversample_factor

    def fit_resample(self, X: np.ndarray, y: np.ndarray):
        # Identify breakouts (minority class). Any non-zero label counts,
        # so both binary (0/1) and signed (-1/0/1) labels are handled.
        breakout_mask = y != 0
        n_breakouts = int(breakout_mask.sum())

        # Extra breakout samples needed to reach the target count
        target_breakouts = int(n_breakouts * self.oversample_factor)
        additional_samples = target_breakouts - n_breakouts

        # Resample breakout rows with replacement
        breakout_indices = np.where(breakout_mask)[0]
        resampled_indices = np.random.choice(
            breakout_indices,
            size=additional_samples,
            replace=True,
        )

        # Append the duplicated rows to the original data
        X_resampled = np.vstack([X, X[resampled_indices]])
        y_resampled = np.concatenate([y, y[resampled_indices]])
        return X_resampled, y_resampled
```
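
To see what a 3x factor does to the class mix, the sketch below applies the same arithmetic as the sampler to the two ends of the 5-10% range quoted above. The dataset size of 10,000 is an arbitrary example value.

```python
def breakout_share_after_oversample(n_total: int, n_breakouts: int,
                                    factor: float = 3.0) -> float:
    """Share of breakout rows after oversampling the minority class."""
    target = int(n_breakouts * factor)  # same rounding as the sampler
    added = target - n_breakouts        # duplicated rows appended to the data
    return target / (n_total + added)


share_5 = breakout_share_after_oversample(10_000, 500)     # 5% breakouts
share_10 = breakout_share_after_oversample(10_000, 1_000)  # 10% breakouts
print(f"{share_5:.1%} {share_10:.1%}")
```

Breakouts therefore move from 5-10% of the rows to roughly 14-25%, which reduces the imbalance without fully equalizing the classes.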
### Alternatives Considered
| Method | Pros | Cons | Selected |
|--------|------|------|----------|
| **SMOTE** | Generates new samples | May introduce noise | No |
| **Class Weights** | Simple | Less effective under severe imbalance | As a complement |
| **Oversample 3x** | Robust, preserves the real sample distribution | Risk of overfitting | **YES** |
### XGBoost Configuration
```python
xgb_params = {
    'n_estimators': 300,
    'max_depth': 5,
    'learning_rate': 0.05,
    'scale_pos_weight': 3,  # Complements the oversampling
    'subsample': 0.8,
    'colsample_bytree': 0.7
}
```
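
Because `scale_pos_weight` multiplies the weight of positive samples, it stacks with the 3x oversampling. The sketch below estimates the share of total training weight carried by breakouts when both are applied; it is a back-of-the-envelope calculation for the binary breakout classifier, and `effective_positive_share` is an illustrative helper.

```python
def effective_positive_share(p: float, oversample_factor: float = 3.0,
                             scale_pos_weight: float = 3.0) -> float:
    """Share of total sample weight on the positive class.

    p: fraction of breakouts in the original data.
    """
    positive = p * oversample_factor * scale_pos_weight
    negative = 1.0 - p
    return positive / (positive + negative)


weight_5 = effective_positive_share(0.05)
weight_10 = effective_positive_share(0.10)
print(f"{weight_5:.1%} {weight_10:.1%}")
```

At a 10% base rate the two settings together give breakouts about half of the total weight; at 5% they carry roughly a third.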
---
## Breakout Detection
### Breakout Definition
A **breakout** is defined by the following conditions:
1. **Active squeeze** for at least N candles
2. **Price breaks** the upper/lower band
3. **Volume** above average
4. **Confirmed move** of X%
```python
import numpy as np
import pandas as pd


def label_breakouts(
    df: pd.DataFrame,
    squeeze_min_length: int = 6,
    breakout_threshold: float = 0.015,  # 1.5%
    forward_bars: int = 12,
) -> np.ndarray:
    """Label breakouts for training: 1 = up, -1 = down, 0 = none."""
    labels = np.zeros(len(df), dtype=int)
    for i in range(len(df) - forward_bars):
        # Only bars with a sufficiently long active squeeze are candidates
        if df['squeeze_length'].iloc[i] >= squeeze_min_length:
            close = df['close'].iloc[i]
            # Largest excursions over the next `forward_bars` bars
            future_high = df['high'].iloc[i + 1:i + forward_bars + 1].max()
            future_low = df['low'].iloc[i + 1:i + forward_bars + 1].min()
            up_move = (future_high - close) / close
            down_move = (close - future_low) / close
            # The upward move is checked first, so it wins if both exceed the threshold
            if up_move >= breakout_threshold:
                labels[i] = 1  # Breakout up
            elif down_move >= breakout_threshold:
                labels[i] = -1  # Breakout down
    return labels
```
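
The labelling function above covers conditions 1 and 4 (squeeze length and size of the move). Conditions 2 and 3 could be checked per bar along the following lines. This is only a sketch: using the close for the band break and a `volume_mult` of 1.0 for "above average" are assumptions, and `confirm_breakout` is a hypothetical helper.

```python
def confirm_breakout(close: float, bb_upper: float, bb_lower: float,
                     volume: float, avg_volume: float,
                     volume_mult: float = 1.0) -> str:
    """Return 'up', 'down' or 'none' for a single bar.

    A break only counts when volume exceeds `volume_mult` times its average.
    """
    if volume <= volume_mult * avg_volume:
        return "none"   # condition 3 fails: no volume confirmation
    if close > bb_upper:
        return "up"     # condition 2: close above the upper band
    if close < bb_lower:
        return "down"   # condition 2: close below the lower band
    return "none"


up = confirm_breakout(103.0, 102.0, 98.0, volume=1500, avg_volume=1000)
weak = confirm_breakout(103.0, 102.0, 98.0, volume=900, avg_volume=1000)
down = confirm_breakout(97.0, 102.0, 98.0, volume=1500, avg_volume=1000)
print(up, weak, down)
```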
### Breakout Types
| Type | Condition | Suggested Action |
|------|-----------|------------------|
| **BREAKOUT_UP** | Breaks the upper band with volume | Long |
| **BREAKOUT_DOWN** | Breaks the lower band with volume | Short |
| **FALSE_BREAKOUT** | Breaks but reverses quickly | Fade |
| **NO_BREAKOUT** | Squeeze without resolution | Wait |
---
## Training Pipeline
### Phase 1: Data Preparation
```python
# 1. Compute volatility features
features = vbp_feature_engineer.compute_volatility_features(df)
# 2. Label breakouts
labels = label_breakouts(df, squeeze_min_length=6)
# 3. Apply balanced sampling (training split only, so duplicated rows never reach validation)
X, y = np.asarray(features), labels  # assumes `features` is array-like (e.g. a DataFrame)
X_balanced, y_balanced = balanced_sampler.fit_resample(X, y)
print(f"Original: {len(X)} samples, {(y != 0).sum()} breakouts")
print(f"Balanced: {len(X_balanced)} samples, {(y_balanced != 0).sum()} breakouts")
```
### Phase 2: CNN Training
```python
# Train CNN + Attention
cnn_model = VBPCNNModel(config)
cnn_model.fit(
    X_train_sequences,
    y_train,
    epochs=50,
    batch_size=64,
    validation_data=(X_val, y_val)
)
# Extract backbone features for the XGBoost stage
cnn_features = cnn_model.extract_features(X_train_sequences)
```
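
The CNN consumes fixed-length sequences rather than single rows. One common way to build them is a sliding window over the feature matrix, sketched below. It assumes each window is labelled with the label of its last bar and uses the 60-step length and 45 features quoted elsewhere in this spec; `make_sequences` is a hypothetical helper.

```python
import numpy as np


def make_sequences(features: np.ndarray, labels: np.ndarray, seq_len: int = 60):
    """Turn (n_bars, n_features) rows into (n_windows, seq_len, n_features).

    Window i covers rows [i, i + seq_len) and takes the label of its last row,
    so no window looks past the bar it is labelled with.
    """
    n_windows = len(features) - seq_len + 1
    if n_windows <= 0:
        raise ValueError("not enough bars for a single sequence")
    X = np.stack([features[i:i + seq_len] for i in range(n_windows)])
    y = labels[seq_len - 1:]
    return X, y


features = np.arange(100 * 45, dtype=float).reshape(100, 45)
labels = np.arange(100)
X_seq, y_seq = make_sequences(features, labels, seq_len=60)
print(X_seq.shape, y_seq.shape)
```

Consecutive windows overlap heavily, so train and validation sets should be split by time before windowing to keep near-duplicate sequences out of validation.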
### Phase 3: XGBoost Training
```python
import numpy as np
from xgboost import XGBClassifier

# Combine CNN features with the original features
combined_features = np.concatenate([
    cnn_features,
    volatility_features,
    squeeze_features
], axis=1)
# Train the breakout classifier
xgb_breakout = XGBClassifier(**xgb_params)
xgb_breakout.fit(combined_features, breakout_labels)
# Train the direction classifier (breakout samples only)
breakout_mask = breakout_labels != 0
xgb_direction = XGBClassifier(**xgb_params)
xgb_direction.fit(
    combined_features[breakout_mask],
    direction_labels[breakout_mask]
)
```
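
`label_breakouts` returns signed labels (-1, 0, 1), while the two classifiers need a binary breakout target and a binary direction target. Recent XGBoost versions also expect class labels to start at 0. A possible mapping, shown here on a toy array, is sketched below; the encoding of "up" as 1 and "down" as 0 is an assumption.

```python
import numpy as np

# Toy output of the labelling step: 1 = up, -1 = down, 0 = no breakout
signed_labels = np.array([0, 1, 0, -1, 0, 1])

# Target for the breakout classifier: did any breakout occur?
breakout_labels = (signed_labels != 0).astype(int)

# Target for the direction classifier: 1 = up, 0 = down
direction_labels = (signed_labels == 1).astype(int)

# The direction classifier is trained on breakout rows only
breakout_mask = breakout_labels == 1
direction_train = direction_labels[breakout_mask]
print(breakout_labels, direction_train)
```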
---
## Evaluation Metrics
### Breakout Detection Metrics
| Metric | Description | Target |
|--------|-------------|--------|
| **Breakout Recall** | Share of real breakouts that are detected | >= 70% |
| **Breakout Precision** | Share of predicted breakouts that occur | >= 50% |
| **F1 Score** | Precision/recall balance | >= 0.55 |
| **False Positive Rate** | False alarms | < 15% |
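
The targets can be cross-checked from a confusion matrix. The example below uses made-up counts (1,000 bars, 100 real breakouts) chosen to sit exactly at the recall and precision targets.

```python
def detection_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Precision, recall, F1 and false positive rate from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "fpr": fpr}


# 100 real breakouts: 70 detected, 30 missed; 70 false alarms among 900 normal bars
m = detection_metrics(tp=70, fp=70, fn=30, tn=830)
print({name: round(value, 3) for name, value in m.items()})
```

A model that just meets the recall (70%) and precision (50%) targets reaches an F1 of about 0.58, so the F1 target of 0.55 is consistent with the other two. With a 10% base rate the false positive rate in this example is about 7.8%, below the 15% limit.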
### Direction Metrics
| Metric | Description | Target |
|--------|-------------|--------|
| **Direction Accuracy** | Correct direction when a breakout occurs | >= 65% |
| **Timing Error** | Error in candles until breakout | < 3 candles |
### Trading Metrics
| Metric | Description | Target |
|--------|-------------|--------|
| **Profit Factor** | Gross profit / Gross loss | > 1.5 |
| **Win Rate** | Winning trades / Total trades | > 55% |
| **Avg Win/Loss Ratio** | Average win / Average loss | > 1.2 |
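
The three trading metrics are linked: profit factor equals win rate times the average win/loss ratio, divided by the loss rate. The short calculation below shows what the individual thresholds imply when taken together; the function names are illustrative.

```python
def profit_factor(win_rate: float, win_loss_ratio: float) -> float:
    """Gross profit / gross loss, expressed through win rate and avg win/loss."""
    return win_rate * win_loss_ratio / (1.0 - win_rate)


def required_win_rate(target_pf: float, win_loss_ratio: float) -> float:
    """Win rate needed to reach `target_pf` at a given avg win/loss ratio."""
    return target_pf / (target_pf + win_loss_ratio)


pf_at_thresholds = profit_factor(0.55, 1.2)
needed = required_win_rate(1.5, 1.2)
print(f"{pf_at_thresholds:.3f} {needed:.3%}")
```

At exactly 55% win rate and a 1.2 ratio the profit factor is about 1.47, slightly under the 1.5 target. Reaching 1.5 with a 1.2 ratio needs a win rate of about 55.6%, so the profit factor is the binding constraint of the three.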
---
## API and Usage
### Main Class: VBPModel
```python
from models.strategies.vbp import VBPModel, VBPConfig
# Configuration
config = VBPConfig(
    conv_channels=[64, 128, 256],
    attention_heads=4,
    sequence_length=60,
    xgb_n_estimators=300,
    oversample_factor=3.0
)
# Initialize the model
model = VBPModel(config)
# Train
metrics = model.fit(df_train, df_val)
# Predict
predictions = model.predict(df_new)
for pred in predictions:
    print(f"Breakout Prob: {pred.breakout_prob:.2%}")
    print(f"Direction: {pred.breakout_direction}")
    print(f"Squeeze Intensity: {pred.squeeze_intensity:.2f}")
```
### VBPPrediction Class
```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VBPPrediction:
    breakout_prob: float        # Breakout probability
    breakout_direction: str     # 'up', 'down', 'none'
    direction_confidence: float
    squeeze_intensity: float    # Intensity of the current squeeze
    squeeze_length: int         # Candles spent in the squeeze
    expected_magnitude: float   # Expected breakout magnitude
    time_to_breakout: int       # Estimated candles until breakout

    @property
    def is_high_probability(self) -> bool:
        return self.breakout_prob > 0.7

    @property
    def signal(self) -> Optional[str]:
        # A directional signal requires a direction other than 'none'
        if (self.breakout_prob > 0.7 and self.direction_confidence > 0.6
                and self.breakout_direction != 'none'):
            return f"BREAKOUT_{self.breakout_direction.upper()}"
        elif self.squeeze_intensity > 0.8:
            return "SQUEEZE_ALERT"
        return None
```
### Squeeze Alerts
```python
from typing import Dict, List


class SqueezeAlert:
    """Alert system for squeezes and breakouts."""

    def check_alerts(self, prediction: VBPPrediction) -> List[Dict]:
        alerts = []
        # Intense squeeze alert
        if prediction.squeeze_intensity > 0.9:
            alerts.append({
                'type': 'SQUEEZE_EXTREME',
                'severity': 'HIGH',
                'message': f'Extreme squeeze detected ({prediction.squeeze_length} bars)'
            })
        # Imminent breakout alert
        if prediction.breakout_prob > 0.8:
            alerts.append({
                'type': 'BREAKOUT_IMMINENT',
                'severity': 'CRITICAL',
                'message': f'Breakout {prediction.breakout_direction} likely'
            })
        return alerts
```
---
## File Structure
```
apps/ml-engine/src/models/strategies/vbp/
├── __init__.py
├── model.py # VBPModel, VBPConfig, VBPPrediction
├── feature_engineering.py # VBPFeatureEngineer (volatility features)
├── cnn_backbone.py # CNN 1D + Attention architecture
├── balanced_sampler.py # Oversample implementation
└── trainer.py # VBPTrainer
```
---
## Production Considerations
### Real-Time Detection
```python
from typing import Dict, Optional


class VBPRealtime:
    def __init__(self, model_path: str):
        self.model = VBPModel.load(model_path)
        self.squeeze_tracker = SqueezeTracker()

    def on_candle(self, candle: Dict) -> Optional[VBPPrediction]:
        # Update the tracker with the latest candle
        self.squeeze_tracker.update(candle)
        # Only predict while a squeeze is active
        if self.squeeze_tracker.is_squeeze:
            features = self.compute_features()
            return self.model.predict_single(features)
        return None
```
### Alert Configuration
```yaml
# config/vbp_alerts.yml
alerts:
  squeeze_alert:
    min_intensity: 0.8
    min_length: 6
    channels: ['telegram', 'webhook']
  breakout_alert:
    min_probability: 0.75
    min_direction_confidence: 0.6
    channels: ['telegram', 'webhook', 'email']
```
### Performance Benchmarks
| Operation | Time | Batch Size |
|-----------|--------|------------|
| Feature computation | 5ms | 1 |
| CNN inference | 10ms | 1 |
| Full prediction | 20ms | 1 |
| Bulk inference | 500ms | 1000 |
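
The figures in the table imply the following per-sample costs and throughput. The calculation assumes that the 20 ms full prediction includes the feature and CNN steps and that single predictions run sequentially on one worker; both are assumptions, since the table does not say.

```python
benchmarks_ms = {
    "feature_computation": 5.0,
    "cnn_inference": 10.0,
    "full_prediction": 20.0,
    "bulk_inference": 500.0,   # for a batch of 1000
}
bulk_batch_size = 1000

per_sample_bulk_ms = benchmarks_ms["bulk_inference"] / bulk_batch_size
batching_speedup = benchmarks_ms["full_prediction"] / per_sample_bulk_ms
single_predictions_per_second = 1000.0 / benchmarks_ms["full_prediction"]
# Time left for the XGBoost heads and overhead in a single prediction
remaining_ms = (benchmarks_ms["full_prediction"]
                - benchmarks_ms["feature_computation"]
                - benchmarks_ms["cnn_inference"])
print(per_sample_bulk_ms, batching_speedup, single_predictions_per_second, remaining_ms)
```

Batched inference costs about 0.5 ms per sample, roughly 40 times cheaper than a single prediction, while one sequential worker handles about 50 single predictions per second.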
---
## References
- [ET-ML-001: ML Engine Architecture](./ET-ML-001-arquitectura.md)
- [ET-ML-010: PVA Strategy](./ET-ML-010-pva-strategy.md)
- [Bollinger Bands](https://school.stockcharts.com/doku.php?id=technical_indicators:bollinger_bands)
- [TTM Squeeze Indicator](https://www.tradingview.com/scripts/ttmsqueeze/)
---
**Author:** ML-Specialist (NEXUS v4.0)
**Date:** 2026-01-25