---
id: "ET-ML-012"
title: "VBP (Volatility Breakout Predictor) Strategy"
type: "Technical Specification"
status: "Approved"
priority: "Alta"
epic: "OQI-006"
project: "trading-platform"
version: "1.0.0"
created_date: "2026-01-25"
updated_date: "2026-01-25"
task_reference: "TASK-2026-01-25-ML-TRAINING-ENHANCEMENT"
---

# ET-ML-012: VBP (Volatility Breakout Predictor) Strategy

## Metadata

| Field | Value |
|-------|-------|
| **ID** | ET-ML-012 |
| **Epic** | OQI-006 - ML Signals |
| **Type** | Technical Specification |
| **Version** | 1.0.0 |
| **Status** | Approved |
| **Last updated** | 2026-01-25 |
| **Reference task** | TASK-2026-01-25-ML-TRAINING-ENHANCEMENT |

---

## Summary

The VBP (Volatility Breakout Predictor) strategy predicts imminent volatility breakouts with a **1D CNN** and **attention** backbone that feeds **XGBoost** classifiers. The model specializes in detecting the volatility compressions (squeezes) that precede explosive moves.

### Key Features

- **CNN 1D + Attention**: Extracts local patterns and global relationships
- **Volatility features**: ATR, Bollinger Bands, Keltner Channels
- **Squeeze detection**: Identifies volatility compressions
- **Balanced sampling**: 3x oversample of breakout events

---

## Architecture

### High-Level Diagram

```
┌─────────────────────────────────────────────────────────────────────────┐
│                                VBP MODEL                                │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  Input: OHLCV Sequence (seq_len x n_features)                           │
│         └── Volatility features (ATR, BB, Keltner)                      │
│                                    │                                    │
│                                    ▼                                    │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                          CNN 1D BACKBONE                          │  │
│  │                                                                   │  │
│  │        ┌─────────────┐   ┌─────────────┐   ┌─────────────┐        │  │
│  │        │   Conv1D    │   │   Conv1D    │   │   Conv1D    │        │  │
│  │        │ 64 filters  │──▶│ 128 filters │──▶│ 256 filters │        │  │
│  │        │  kernel=7   │   │  kernel=5   │   │  kernel=3   │        │  │
│  │        └─────────────┘   └─────────────┘   └─────────────┘        │  │
│  │               │                 │                 │               │  │
│  │               └─────────────────┼─────────────────┘               │  │
│  │                                 ▼                                 │  │
│  │                      ┌─────────────────────┐                      │  │
│  │                      │ Multi-Scale Concat  │                      │  │
│  │                      └─────────────────────┘                      │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                    │                                    │
│                                    ▼                                    │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                          ATTENTION LAYER                          │  │
│  │                                                                   │  │
│  │  ┌─────────────────┐     ┌─────────────────────────────────────┐  │  │
│  │  │ Self-Attention  │────▶│     Temporal Attention Pooling      │  │  │
│  │  │    (4 heads)    │     │    (weighted avg over sequence)     │  │  │
│  │  └─────────────────┘     └─────────────────────────────────────┘  │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                    │                                    │
│                                    ▼                                    │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                           XGBOOST HEAD                            │  │
│  │                                                                   │  │
│  │          ┌──────────────────┐       ┌──────────────────┐          │  │
│  │          │     Breakout     │       │     Breakout     │          │  │
│  │          │    Classifier    │       │    Direction     │          │  │
│  │          │     (binary)     │       │    Classifier    │          │  │
│  │          └──────────────────┘       └──────────────────┘          │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                    │                                    │
│                                    ▼                                    │
│  Output: VBPPrediction                                                  │
│    - breakout_prob: float (0 to 1)                                      │
│    - breakout_direction: 'up' | 'down' | 'none'                         │
│    - squeeze_intensity: float                                           │
│    - expected_magnitude: float                                          │
└─────────────────────────────────────────────────────────────────────────┘
```

### CNN 1D Configuration

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CNNConfig:
    conv_channels: List[int] = field(default_factory=lambda: [64, 128, 256])
    kernel_sizes: List[int] = field(default_factory=lambda: [7, 5, 3])
    pool_sizes: List[int] = field(default_factory=lambda: [2, 2, 2])
    activation: str = "gelu"
    batch_norm: bool = True
    dropout: float = 0.2
```

| Layer | Filters | Kernel | Output |
|-------|---------|--------|--------|
| Conv1D_1 | 64 | 7 | Local patterns |
| Conv1D_2 | 128 | 5 | Mid-range patterns |
| Conv1D_3 | 256 | 3 | High-level abstractions |
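
Each pooling stage shortens the time axis that the attention layer later attends over. As a rough sketch, the helper below traces the sequence length through the three blocks. It assumes stride-1 convolutions and non-overlapping pooling, and it uses `seq_len=60` from the `VBPConfig` usage example. The padding mode is not stated in this spec, so both common options are shown. `conv_stack_lengths` is a hypothetical helper, not part of the VBP codebase.

```python
from typing import List


def conv_stack_lengths(
    seq_len: int,
    kernel_sizes: List[int],
    pool_sizes: List[int],
    padding: str = "same",
) -> List[int]:
    """Return the time-axis length after each Conv1D + pooling block."""
    lengths = []
    length = seq_len
    for kernel, pool in zip(kernel_sizes, pool_sizes):
        if padding == "valid":
            # No padding: a stride-1 convolution trims kernel - 1 steps
            length = length - kernel + 1
        elif padding != "same":
            raise ValueError(f"unknown padding mode: {padding}")
        if length <= 0:
            raise ValueError("sequence too short for this stack")
        # Non-overlapping pooling keeps floor(length / pool) steps
        length //= pool
        lengths.append(length)
    return lengths


print(conv_stack_lengths(60, [7, 5, 3], [2, 2, 2], padding="same"))   # [30, 15, 7]
print(conv_stack_lengths(60, [7, 5, 3], [2, 2, 2], padding="valid"))  # [27, 11, 4]
```

Under either assumption the attention layer sees only a handful of time steps, which is worth keeping in mind when choosing `sequence_length`.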

### Attention Configuration

| Parameter | Value |
|-----------|-------|
| **n_heads** | 4 |
| **attention_dim** | 128 |
| **dropout** | 0.1 |
| **pooling** | Temporal weighted average |
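
The "temporal weighted average" pooling can be illustrated without any deep-learning framework. The sketch below is dependency-free and assumes the per-step attention scores are already available; in the real model they would come from a learned layer, which is not shown. The function names are illustrative, not the VBP API.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one score per time step."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention_pool(
    sequence: Sequence[Sequence[float]], scores: Sequence[float]
) -> List[float]:
    """Collapse (time, channels) to (channels,) using softmax weights."""
    if len(sequence) != len(scores):
        raise ValueError("need exactly one score per time step")
    weights = softmax(scores)
    n_channels = len(sequence[0])
    return [
        sum(w * step[c] for w, step in zip(weights, sequence))
        for c in range(n_channels)
    ]


# Equal scores reduce to a plain average over time
print(temporal_attention_pool([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0]))  # [2.0, 3.0]
```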

---

## Feature Engineering

### Volatility Features

#### 1. ATR (Average True Range)

```text
ATR Features:
  atr_14: ATR(14)
  atr_7: ATR(7)
  atr_28: ATR(28)
  atr_ratio: atr_7 / atr_28
  atr_percentile: rolling percentile of ATR
  normalized_atr: atr / close
```

| Feature | Description | Use |
|---------|-------------|-----|
| `atr_14` | Standard ATR | Baseline volatility |
| `atr_ratio` | Short/long ATR | Expansion/contraction |
| `normalized_atr` | ATR as a fraction of price | Comparison across symbols |
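
For reference, a minimal, dependency-free ATR could look like the sketch below. The spec does not say which smoothing is used; Wilder's smoothing (the classic ATR definition) is assumed here, and the function names are illustrative.

```python
from typing import List, Optional, Sequence


def true_range(
    highs: Sequence[float], lows: Sequence[float], closes: Sequence[float]
) -> List[float]:
    """True range per bar; the first bar has no previous close."""
    ranges = [highs[0] - lows[0]]
    for i in range(1, len(closes)):
        prev_close = closes[i - 1]
        ranges.append(max(
            highs[i] - lows[i],
            abs(highs[i] - prev_close),
            abs(lows[i] - prev_close),
        ))
    return ranges


def atr(
    highs: Sequence[float],
    lows: Sequence[float],
    closes: Sequence[float],
    period: int = 14,
) -> List[Optional[float]]:
    """Wilder-smoothed ATR; None until `period` bars are available."""
    tr = true_range(highs, lows, closes)
    out: List[Optional[float]] = [None] * len(tr)
    if len(tr) < period:
        return out
    value = sum(tr[:period]) / period  # seed with a simple average
    out[period - 1] = value
    for i in range(period, len(tr)):
        value = (value * (period - 1) + tr[i]) / period
        out[i] = value
    return out


highs = [10.0, 11.0, 12.0, 13.0]
lows = [9.0, 10.0, 11.0, 12.0]
closes = [9.5, 10.5, 11.5, 12.5]
atr_2 = atr(highs, lows, closes, period=2)
print(atr_2)  # [None, 1.25, 1.375, 1.4375]
print(atr_2[-1] / closes[-1])  # normalized_atr for the last bar
```

`atr_ratio` follows the same pattern: divide the last value of `atr(..., period=7)` by the last value of `atr(..., period=28)`.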

#### 2. Bollinger Bands

```text
BB Features:
  bb_upper: SMA(20) + 2*std
  bb_lower: SMA(20) - 2*std
  bb_width: (upper - lower) / middle
  bb_width_percentile: percentile(bb_width, 100)
  bb_squeeze: bb_width < percentile_20
  bb_expansion: bb_width > percentile_80
  bb_position: (close - lower) / (upper - lower)
```

| Feature | Formula | Interpretation |
|---------|---------|----------------|
| `bb_width` | `(upper - lower) / middle` | Relative width |
| `bb_squeeze` | `bb_width < percentile(20)` | Volatility compression |
| `bb_position` | Position within the bands (0 = lower, 1 = upper) | Overbought/oversold |
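
The band features above can be sketched for a single bar as follows. This version uses only the standard library and assumes a population standard deviation; if the real feature engineer uses a sample standard deviation (the pandas rolling default), the bands will be slightly wider. `percentile_rank` is a hypothetical helper for the percentile-based flags.

```python
import statistics
from typing import Dict, Sequence


def bollinger_features(
    closes: Sequence[float], period: int = 20, num_std: float = 2.0
) -> Dict[str, float]:
    """Bollinger features for the most recent bar of `closes`."""
    if len(closes) < period:
        raise ValueError("not enough bars for the requested period")
    window = closes[-period:]
    middle = statistics.fmean(window)
    std = statistics.pstdev(window)  # population std (assumption)
    upper = middle + num_std * std
    lower = middle - num_std * std
    band = upper - lower
    return {
        "bb_upper": upper,
        "bb_lower": lower,
        "bb_width": band / middle,
        # A flat window has zero band height; report the midpoint
        "bb_position": (closes[-1] - lower) / band if band > 0 else 0.5,
    }


def percentile_rank(values: Sequence[float]) -> float:
    """Share of earlier values strictly below the latest one (0 to 1)."""
    history, latest = values[:-1], values[-1]
    return sum(v < latest for v in history) / len(history)


# bb_squeeze: the latest width sits in the bottom 20% of recent widths
widths = [0.05, 0.04, 0.06, 0.03, 0.01]
print(percentile_rank(widths) < 0.20)  # True
```

`percentile_rank` needs at least two values, since it compares the latest width against its history.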

#### 3. Keltner Channels

```text
Keltner Features:
  kc_upper: EMA(20) + kc_atr_mult * ATR(10)
  kc_lower: EMA(20) - kc_atr_mult * ATR(10)
  kc_width: (upper - lower) / middle
  kc_squeeze: bb_lower > kc_lower AND bb_upper < kc_upper
  kc_position: (close - lower) / (upper - lower)
```

#### 4. Squeeze Detection

```text
Squeeze Features:
  is_squeeze: BB inside Keltner
  squeeze_length: consecutive squeeze bars
  squeeze_momentum: Rate of change during squeeze
  squeeze_release: first bar after squeeze ends
  squeeze_intensity: 1 - bb_width / kc_width  (clipped to [0, 1]; higher = tighter)
```

| State | Condition | Meaning |
|-------|-----------|---------|
| **SQUEEZE ON** | BB inside KC | Compression active |
| **SQUEEZE OFF** | BB outside KC | Expansion under way |
| **SQUEEZE RELEASE** | Transition from ON to OFF | Breakout moment |
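
The three states can be derived bar by bar from the band values. The following is a minimal sketch of that bookkeeping; `in_squeeze` and `track_squeeze` are illustrative names, not the `SqueezeTracker` used in production.

```python
from typing import List, Sequence, Tuple


def in_squeeze(bb_lower: float, bb_upper: float, kc_lower: float, kc_upper: float) -> bool:
    """SQUEEZE ON: the Bollinger Bands sit entirely inside the Keltner Channel."""
    return bb_lower > kc_lower and bb_upper < kc_upper


def track_squeeze(is_squeeze: Sequence[bool]) -> Tuple[List[int], List[bool]]:
    """Return (squeeze_length, squeeze_release) per bar."""
    lengths: List[int] = []
    releases: List[bool] = []
    run = 0
    for active in is_squeeze:
        # A release is the first bar after a run of squeeze bars ends
        releases.append(run > 0 and not active)
        run = run + 1 if active else 0
        lengths.append(run)
    return lengths, releases


flags = [False, True, True, True, False, False, True]
lengths, releases = track_squeeze(flags)
print(lengths)   # [0, 1, 2, 3, 0, 0, 1]
print(releases)  # [False, False, False, False, True, False, False]
```

Only the bar at index 4 is a release: it is the first non-squeeze bar after three consecutive squeeze bars.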

### Feature Summary

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VBPFeatureConfig:
    atr_periods: List[int] = field(default_factory=lambda: [7, 14, 28])
    bb_period: int = 20
    bb_std: float = 2.0
    kc_period: int = 20
    kc_atr_mult: float = 1.5
    squeeze_lookback: int = 20
    feature_count: int = 45  # Total features
```

---

## Balanced Sampling

### Problem: Class Imbalance

Breakout events are rare (~5-10% of the data), which causes:
- A model biased toward the majority class
- Low recall on breakouts
- Misleading metrics
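
A small worked example makes the "misleading metrics" point concrete. The sketch below scores a degenerate model that never predicts a breakout on data with 5% breakouts; the numbers are illustrative, not measurements from this project.

```python
from typing import Sequence, Tuple


def accuracy_and_recall(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Accuracy over all bars and recall on the breakout class (label 1)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    positives = sum(t == 1 for t in y_true)
    recall = true_pos / positives if positives else 0.0
    return correct / len(y_true), recall


# 5 breakouts in 100 bars, scored against a model that never predicts one
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100
print(accuracy_and_recall(y_true, y_pred))  # (0.95, 0.0)
```

The model reaches 95% accuracy while detecting no breakouts at all, which is why recall and precision on the breakout class are the metrics to watch.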

### Solution: 3x Oversample

```python
from typing import Optional, Tuple

import numpy as np


class BalancedSampler:
    """Oversample breakout events (label 1) by a fixed factor, 3x by default.

    Apply to the training split only, so duplicates never reach validation.
    """

    def __init__(self, oversample_factor: float = 3.0, seed: Optional[int] = None):
        self.oversample_factor = oversample_factor
        self.rng = np.random.default_rng(seed)

    def fit_resample(self, X: np.ndarray, y: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
        # Identify breakouts (minority class); y must be binary {0, 1}
        breakout_indices = np.where(y == 1)[0]
        n_breakouts = len(breakout_indices)

        # Number of extra copies needed to reach factor * n_breakouts
        target_breakouts = int(n_breakouts * self.oversample_factor)
        additional_samples = max(target_breakouts - n_breakouts, 0)
        if additional_samples == 0:
            return X, y

        # Resample existing breakouts with replacement
        resampled_indices = self.rng.choice(
            breakout_indices,
            size=additional_samples,
            replace=True
        )

        # Append the duplicates to the original data
        X_resampled = np.concatenate([X, X[resampled_indices]], axis=0)
        y_resampled = np.concatenate([y, y[resampled_indices]])

        return X_resampled, y_resampled
```

### Alternatives Considered

| Method | Pros | Cons | Selected |
|--------|------|------|----------|
| **SMOTE** | Generates new samples | Can introduce noise | No |
| **Class Weights** | Simple | Less effective under severe imbalance | As a complement |
| **Oversample 3x** | Robust, uses only real samples | Risk of overfitting | **Yes** |

### XGBoost Configuration

```python
xgb_params = {
    'n_estimators': 300,
    'max_depth': 5,
    'learning_rate': 0.05,
    'scale_pos_weight': 3,  # Complements the oversample
    'subsample': 0.8,
    'colsample_bytree': 0.7
}
```
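
Because `scale_pos_weight` is applied on top of the 3x oversample, the two effects multiply. The arithmetic sketch below estimates the share of total weight carried by breakouts at each stage; it assumes `scale_pos_weight` acts as a plain multiplier on positive-sample weight, and the sample counts are illustrative. For 50 breakouts among 1,000 bars it gives roughly 5% raw, 14% after oversampling, and 32% after weighting.

```python
from typing import Dict


def effective_balance(
    n_breakouts: int,
    n_normal: int,
    oversample_factor: float = 3.0,
    scale_pos_weight: float = 3.0,
) -> Dict[str, float]:
    """Breakout share of the total weight before and after each adjustment."""
    resampled = int(n_breakouts * oversample_factor)
    weighted = resampled * scale_pos_weight
    return {
        "raw_share": n_breakouts / (n_breakouts + n_normal),
        "resampled_share": resampled / (resampled + n_normal),
        "weighted_share": weighted / (weighted + n_normal),
    }


print(effective_balance(n_breakouts=50, n_normal=950))
```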

---

## Breakout Detection

### Breakout Definition

A **breakout** is defined as an event in which:

1. **A squeeze has been active** for at least N candles
2. **Price breaks** the upper/lower band
3. **Volume** is above average
4. **The move is confirmed** by a change of X%

The labeling function below encodes conditions 1 and 4 through `squeeze_min_length` and `breakout_threshold`; it does not check the band break or volume.

```python
import numpy as np
import pandas as pd


def label_breakouts(
    df: pd.DataFrame,
    squeeze_min_length: int = 6,
    breakout_threshold: float = 0.015,  # 1.5%
    forward_bars: int = 12
) -> np.ndarray:
    """Label breakouts for training: 1 = up, -1 = down, 0 = none."""
    labels = np.zeros(len(df), dtype=int)

    for i in range(len(df) - forward_bars):
        # Require an active squeeze of sufficient length
        if df['squeeze_length'].iloc[i] >= squeeze_min_length:
            # Look for a breakout within the next forward_bars candles
            close = df['close'].iloc[i]
            future_high = df['high'].iloc[i+1:i+forward_bars+1].max()
            future_low = df['low'].iloc[i+1:i+forward_bars+1].min()

            up_move = (future_high - close) / close
            down_move = (close - future_low) / close

            # If both thresholds are hit in the window, the up label wins
            if up_move >= breakout_threshold:
                labels[i] = 1  # Breakout Up
            elif down_move >= breakout_threshold:
                labels[i] = -1  # Breakout Down

    return labels
```
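
For comparison, a per-candle check that covers all four conditions of the definition could be sketched as follows. Everything beyond the two defaults taken from `label_breakouts` is an assumption made for illustration: which band is used, how average volume is computed, and measuring the move against the previous close are not specified in this document, and `is_confirmed_breakout` is a hypothetical name.

```python
def is_confirmed_breakout(
    squeeze_length: int,
    close: float,
    prev_close: float,
    upper_band: float,
    lower_band: float,
    volume: float,
    avg_volume: float,
    squeeze_min_length: int = 6,
    breakout_threshold: float = 0.015,
) -> int:
    """Return 1 (up), -1 (down) or 0 (none) for a single candle."""
    # 1. Squeeze active long enough before this candle
    if squeeze_length < squeeze_min_length:
        return 0
    # 3. Volume above its average
    if volume <= avg_volume:
        return 0
    # 4. Move of at least the threshold relative to the previous close
    move = (close - prev_close) / prev_close
    # 2. Price closes beyond the upper or lower band
    if close > upper_band and move >= breakout_threshold:
        return 1
    if close < lower_band and -move >= breakout_threshold:
        return -1
    return 0


print(is_confirmed_breakout(8, 103.0, 100.0, 102.0, 98.0, 1500.0, 1000.0))  # 1
print(is_confirmed_breakout(8, 103.0, 100.0, 102.0, 98.0, 900.0, 1000.0))   # 0
```

The second call fails only the volume condition, which is the kind of case the forward-looking labeler would still mark as a breakout.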

### Breakout Types

| Type | Condition | Suggested Action |
|------|-----------|------------------|
| **BREAKOUT_UP** | Breaks the upper band with volume | Long |
| **BREAKOUT_DOWN** | Breaks the lower band with volume | Short |
| **FALSE_BREAKOUT** | Breaks but quickly reverses | Fade |
| **NO_BREAKOUT** | Squeeze without resolution | Wait |

---

## Training Pipeline

### Phase 1: Data Preparation

```python
# 1. Compute volatility features
features = vbp_feature_engineer.compute_volatility_features(df)

# 2. Label breakouts (1 = up, -1 = down, 0 = none)
labels = label_breakouts(df, squeeze_min_length=6)

# 3. Apply balanced sampling on the binary breakout target
X = np.asarray(features)
y = (labels != 0).astype(int)
X_balanced, y_balanced = balanced_sampler.fit_resample(X, y)

print(f"Original: {len(X)} samples, {(y==1).sum()} breakouts")
print(f"Balanced: {len(X_balanced)} samples, {(y_balanced==1).sum()} breakouts")
```

### Phase 2: CNN Training

```python
# Train CNN + Attention
cnn_model = VBPCNNModel(config)
cnn_model.fit(
    X_train_sequences,
    y_train,
    epochs=50,
    batch_size=64,
    validation_data=(X_val, y_val)
)

# Extract features from the backbone for the same training sequences
cnn_features = cnn_model.extract_features(X_train_sequences)
```

### Phase 3: XGBoost Training

```python
from xgboost import XGBClassifier

# Combine CNN features with the original features
combined_features = np.concatenate([
    cnn_features,
    volatility_features,
    squeeze_features
], axis=1)

# Train the breakout classifier (breakout_labels: 0 = none, 1 = breakout)
xgb_breakout = XGBClassifier(**xgb_params)
xgb_breakout.fit(combined_features, breakout_labels)

# Train the direction classifier on breakout rows only.
# XGBClassifier expects class labels 0..n-1, so direction_labels
# should be encoded as 0 = down, 1 = up rather than -1/1.
breakout_mask = breakout_labels != 0
xgb_direction = XGBClassifier(**xgb_params)
xgb_direction.fit(
    combined_features[breakout_mask],
    direction_labels[breakout_mask]
)
```

---

## Evaluation Metrics

### Breakout Detection Metrics

| Metric | Description | Target |
|--------|-------------|--------|
| **Breakout Recall** | Share of real breakouts that are detected | >= 70% |
| **Breakout Precision** | Share of predicted breakouts that occur | >= 50% |
| **F1 Score** | Precision/recall balance | >= 0.55 |
| **False Positive Rate** | False alarms | < 15% |
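
The four targets can be checked directly from confusion-matrix counts. The helper below is a small sketch using the standard definitions; the counts in the example are made up for illustration, and `TARGETS` and `MAX_FPR` simply mirror the table.

```python
from typing import Dict


def detection_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Breakout detection metrics from confusion-matrix counts."""
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return {"recall": recall, "precision": precision, "f1": f1, "fpr": fpr}


TARGETS = {"recall": 0.70, "precision": 0.50, "f1": 0.55}
MAX_FPR = 0.15


def meets_targets(metrics: Dict[str, float]) -> bool:
    """True when every minimum is reached and the FPR stays under its cap."""
    minimums_ok = all(metrics[name] >= floor for name, floor in TARGETS.items())
    return minimums_ok and metrics["fpr"] < MAX_FPR


m = detection_metrics(tp=70, fp=70, fn=30, tn=830)
print(m["recall"], m["precision"])  # 0.7 0.5
print(meets_targets(m))             # True
```

At exactly 70% recall and 50% precision the F1 score is about 0.58, so the F1 target of 0.55 is automatically met whenever both individual targets are.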

### Direction Metrics

| Metric | Description | Target |
|--------|-------------|--------|
| **Direction Accuracy** | Correct direction when a breakout occurs | >= 65% |
| **Timing Error** | Error, in candles, of the predicted time to breakout | < 3 candles |

### Trading Metrics

| Metric | Description | Target |
|--------|-------------|--------|
| **Profit Factor** | Gross profit / Gross loss | > 1.5 |
| **Win Rate** | Winning trades / Total trades | > 55% |
| **Avg Win/Loss Ratio** | Average win / Average loss | > 1.2 |
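
These three ratios can be computed from a list of per-trade profit and loss values. The sketch below is one possible reading of the definitions: break-even trades count toward the total but are neither wins nor losses, which is an assumption, and the sample PnL values are illustrative.

```python
from typing import Dict, Sequence


def trading_metrics(pnls: Sequence[float]) -> Dict[str, float]:
    """Profit factor, win rate and average win/loss ratio from trade PnLs."""
    wins = [p for p in pnls if p > 0]
    losses = [-p for p in pnls if p < 0]
    gross_profit, gross_loss = sum(wins), sum(losses)
    avg_win = gross_profit / len(wins) if wins else 0.0
    avg_loss = gross_loss / len(losses) if losses else 0.0
    return {
        "profit_factor": gross_profit / gross_loss if gross_loss else float("inf"),
        "win_rate": len(wins) / len(pnls) if pnls else 0.0,
        "avg_win_loss_ratio": avg_win / avg_loss if avg_loss else float("inf"),
    }


m = trading_metrics([100.0, -50.0, 200.0, -50.0, 50.0])
print(m["profit_factor"], m["win_rate"])  # 3.5 0.6
```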

---

## API and Usage

### Main Class: VBPModel

```python
from models.strategies.vbp import VBPModel, VBPConfig

# Configuration
config = VBPConfig(
    conv_channels=[64, 128, 256],
    attention_heads=4,
    sequence_length=60,
    xgb_n_estimators=300,
    oversample_factor=3.0
)

# Initialize the model
model = VBPModel(config)

# Train
metrics = model.fit(df_train, df_val)

# Predict
predictions = model.predict(df_new)
for pred in predictions:
    print(f"Breakout Prob: {pred.breakout_prob:.2%}")
    print(f"Direction: {pred.breakout_direction}")
    print(f"Squeeze Intensity: {pred.squeeze_intensity:.2f}")
```

### VBPPrediction Class

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VBPPrediction:
    breakout_prob: float       # Breakout probability
    breakout_direction: str    # 'up', 'down', 'none'
    direction_confidence: float
    squeeze_intensity: float   # Intensity of the current squeeze
    squeeze_length: int        # Candles in squeeze
    expected_magnitude: float  # Expected breakout magnitude
    time_to_breakout: int      # Estimated candles until breakout

    @property
    def is_high_probability(self) -> bool:
        return self.breakout_prob > 0.7

    @property
    def signal(self) -> Optional[str]:
        # Never emit a directional signal when no direction was predicted
        if (
            self.is_high_probability
            and self.direction_confidence > 0.6
            and self.breakout_direction != 'none'
        ):
            return f"BREAKOUT_{self.breakout_direction.upper()}"
        elif self.squeeze_intensity > 0.8:
            return "SQUEEZE_ALERT"
        return None
```

### Squeeze Alerts

```python
from typing import Dict, List


class SqueezeAlert:
    """Alert system for squeezes and breakouts"""

    def check_alerts(self, prediction: VBPPrediction) -> List[Dict]:
        alerts = []

        # Intense squeeze alert
        if prediction.squeeze_intensity > 0.9:
            alerts.append({
                'type': 'SQUEEZE_EXTREME',
                'severity': 'HIGH',
                'message': f'Extreme squeeze detected ({prediction.squeeze_length} bars)'
            })

        # Imminent breakout alert
        if prediction.breakout_prob > 0.8:
            alerts.append({
                'type': 'BREAKOUT_IMMINENT',
                'severity': 'CRITICAL',
                'message': f'Breakout {prediction.breakout_direction} likely'
            })

        return alerts
```

---

## File Structure

```
apps/ml-engine/src/models/strategies/vbp/
├── __init__.py
├── model.py                 # VBPModel, VBPConfig, VBPPrediction
├── feature_engineering.py   # VBPFeatureEngineer (volatility features)
├── cnn_backbone.py          # CNN 1D + Attention architecture
├── balanced_sampler.py      # Oversample implementation
└── trainer.py               # VBPTrainer
```

---

## Production Considerations

### Real-Time Detection

```python
from typing import Dict, Optional


class VBPRealtime:
    def __init__(self, model_path: str):
        self.model = VBPModel.load(model_path)
        self.squeeze_tracker = SqueezeTracker()

    def on_candle(self, candle: Dict) -> Optional[VBPPrediction]:
        # Update the tracker with the new candle
        self.squeeze_tracker.update(candle)

        # Only predict while a squeeze is active
        if self.squeeze_tracker.is_squeeze:
            features = self.compute_features()
            return self.model.predict_single(features)

        return None
```

### Alert Configuration

```yaml
# config/vbp_alerts.yml
alerts:
  squeeze_alert:
    min_intensity: 0.8
    min_length: 6
    channels: ['telegram', 'webhook']

  breakout_alert:
    min_probability: 0.75
    min_direction_confidence: 0.6
    channels: ['telegram', 'webhook', 'email']
```
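
One way these thresholds could be applied is sketched below. The configuration is written inline as a dictionary that mirrors the YAML, so that the example has no dependencies; reading the file, the `route_alerts` name, the dictionary-shaped prediction, and the choice of inclusive comparisons (`>=`) are all assumptions for illustration.

```python
from typing import Dict, List

# Mirrors config/vbp_alerts.yml; loading the file itself is left out
ALERT_CONFIG = {
    "squeeze_alert": {
        "min_intensity": 0.8,
        "min_length": 6,
        "channels": ["telegram", "webhook"],
    },
    "breakout_alert": {
        "min_probability": 0.75,
        "min_direction_confidence": 0.6,
        "channels": ["telegram", "webhook", "email"],
    },
}


def route_alerts(
    prediction: Dict[str, float], config: Dict = ALERT_CONFIG
) -> Dict[str, List[str]]:
    """Map each triggered alert name to the channels it should go to."""
    routed: Dict[str, List[str]] = {}
    squeeze = config["squeeze_alert"]
    if (prediction["squeeze_intensity"] >= squeeze["min_intensity"]
            and prediction["squeeze_length"] >= squeeze["min_length"]):
        routed["squeeze_alert"] = squeeze["channels"]
    breakout = config["breakout_alert"]
    if (prediction["breakout_prob"] >= breakout["min_probability"]
            and prediction["direction_confidence"] >= breakout["min_direction_confidence"]):
        routed["breakout_alert"] = breakout["channels"]
    return routed


pred = {"squeeze_intensity": 0.85, "squeeze_length": 7,
        "breakout_prob": 0.5, "direction_confidence": 0.9}
print(route_alerts(pred))  # {'squeeze_alert': ['telegram', 'webhook']}
```

Keeping the thresholds in configuration rather than in code lets them be tuned without redeploying the model service.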

### Performance Benchmarks

| Operation | Time | Batch Size |
|-----------|------|------------|
| Feature computation | 5ms | 1 |
| CNN inference | 10ms | 1 |
| Full prediction | 20ms | 1 |
| Bulk inference | 500ms | 1000 |

---

## References

- [ET-ML-001: Arquitectura ML Engine](./ET-ML-001-arquitectura.md)
- [ET-ML-010: PVA Strategy](./ET-ML-010-pva-strategy.md)
- [Bollinger Bands](https://school.stockcharts.com/doku.php?id=technical_indicators:bollinger_bands)
- [TTM Squeeze Indicator](https://www.tradingview.com/scripts/ttmsqueeze/)

---

**Author:** ML-Specialist (NEXUS v4.0)
**Date:** 2026-01-25