
---
id: ET-ML-010
title: PVA (Price Variation Attention) Strategy
type: Technical Specification
status: Approved
priority: High
epic: OQI-006
project: trading-platform
version: 1.0.0
created_date: 2026-01-25
updated_date: 2026-01-25
task_reference: TASK-2026-01-25-ML-TRAINING-ENHANCEMENT
---

# ET-ML-010: PVA (Price Variation Attention) Strategy

## Metadata

| Field | Value |
|---|---|
| ID | ET-ML-010 |
| Epic | OQI-006 - ML Signals |
| Type | Technical Specification |
| Version | 1.0.0 |
| Status | Approved |
| Last updated | 2026-01-25 |
| Task reference | TASK-2026-01-25-ML-TRAINING-ENHANCEMENT |

## Summary

The PVA (Price Variation Attention) strategy is a hybrid model that combines a Transformer Encoder for representation learning with XGBoost for the final predictions. The model predicts the direction and magnitude of price variations over a future horizon.

### Key Features

- **Time-agnostic design**: no temporal features (hour, day) are used, to avoid overfitting
- **6 independent models**: one model per symbol (XAUUSD, EURUSD, GBPUSD, USDJPY, BTCUSD, ETHUSD)
- **Hybrid architecture**: Transformer Encoder + XGBoost head
- **Prediction targets**: direction (bullish/bearish) and magnitude of the move

## Architecture

### High-Level Diagram

```
┌─────────────────────────────────────────────────────────────────────────┐
│                        PVA MODEL                                         │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  Input: OHLCV Sequence (seq_len x n_features)                           │
│         └── Returns, Acceleration, Volatility features                  │
│                          │                                               │
│                          ▼                                               │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                  TRANSFORMER ENCODER                               │  │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐            │  │
│  │  │ Input Linear │──▶│ Positional   │──▶│ Encoder      │           │  │
│  │  │ Projection   │  │ Encoding     │  │ Layers (4)   │           │  │
│  │  └──────────────┘  └──────────────┘  └──────────────┘            │  │
│  │                                              │                     │  │
│  │                                              ▼                     │  │
│  │                                    ┌──────────────────┐           │  │
│  │                                    │ Sequence Pooling │           │  │
│  │                                    │ (Mean + Last)    │           │  │
│  │                                    └──────────────────┘           │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                          │                                               │
│                          ▼                                               │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                    XGBOOST HEAD                                    │  │
│  │  ┌──────────────────────┐    ┌──────────────────────┐            │  │
│  │  │ Direction Classifier │    │ Magnitude Regressor  │            │  │
│  │  │ (binary: up/down)    │    │ (absolute magnitude) │            │  │
│  │  └──────────────────────┘    └──────────────────────┘            │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                          │                                               │
│                          ▼                                               │
│  Output: PVAPrediction                                                   │
│          - direction: float (-1 to 1)                                    │
│          - magnitude: float (absolute expected move)                     │
│          - confidence: float (0 to 1)                                    │
└─────────────────────────────────────────────────────────────────────────┘
```

### Transformer Encoder Components

| Component | Configuration |
|---|---|
| Layers | 4 encoder layers |
| d_model | 256 |
| n_heads | 8 attention heads |
| d_ff | 1024 (feed-forward dimension) |
| dropout | 0.1 |
| positional_encoding | Sinusoidal |
| sequence_length | 100 candles |
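The sinusoidal positional encoding in the table is the standard formulation from the Transformer literature; a minimal NumPy sketch (the function name is illustrative, not taken from the repo):

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Standard sin/cos positional encoding (Vaswani et al., 2017)."""
    pos = np.arange(seq_len)[:, None]                  # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]               # (1, d_model/2)
    angle = pos / np.power(10000.0, 2 * i / d_model)   # one frequency per dimension pair
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle)   # even dimensions
    pe[:, 1::2] = np.cos(angle)   # odd dimensions
    return pe

# With the spec's settings this is a (100, 256) matrix
pe = sinusoidal_positional_encoding(100, 256)
```

The matrix is added element-wise to the output of the input linear projection before the encoder layers.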

### XGBoost Configuration

| Parameter | Value |
|---|---|
| n_estimators | 200 |
| max_depth | 6 |
| learning_rate | 0.05 |
| subsample | 0.8 |
| colsample_bytree | 0.8 |
| reg_alpha | 0.1 |
| reg_lambda | 1.0 |

## Feature Engineering

### Time-Agnostic Design

The model deliberately excludes temporal features (hour of day, day of week) in order to:

- Avoid overfitting to specific temporal patterns
- Improve generalization across different market conditions
- Reduce the risk of concept drift

### Implemented Features

```yaml
PVAFeatureConfig:
    return_periods: [1, 5, 10, 20]
    volatility_window: 20
    stats_window: 50
    sequence_length: 100
```

#### 1. Return Features

| Feature | Description | Formula |
|---|---|---|
| return_1 | 1-period return | `(close / close.shift(1)) - 1` |
| return_5 | 5-period return | `(close / close.shift(5)) - 1` |
| return_10 | 10-period return | `(close / close.shift(10)) - 1` |
| return_20 | 20-period return | `(close / close.shift(20)) - 1` |

#### 2. Acceleration Features

| Feature | Description | Formula |
|---|---|---|
| acceleration_1 | Change in short-term momentum | `return_1 - return_1.shift(1)` |
| acceleration_5 | Change in medium-term momentum | `return_5 - return_5.shift(5)` |
| acceleration_20 | Change in long-term momentum | `return_20 - return_20.shift(20)` |

#### 3. Volatility Features

| Feature | Description | Formula |
|---|---|---|
| volatility_returns | Volatility of returns | `return_1.rolling(20).std()` |
| volatility_ratio | Current vs. average volatility | `volatility / volatility.rolling(50).mean()` |
| range_volatility | Volatility of candle ranges | `((high - low) / close).rolling(20).std()` |

#### 4. Statistical Features

| Feature | Description | Formula |
|---|---|---|
| return_skew | Skewness of returns | `return_1.rolling(50).skew()` |
| return_kurt | Kurtosis of returns | `return_1.rolling(50).kurt()` |
| zscore_return | Z-score of the return | `(return_1 - mean) / std` |
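The tables above translate almost one-to-one into pandas rolling operations. A hedged sketch, assuming an OHLCV frame with `close`/`high`/`low` columns (the function name is illustrative; the repo's `PVAFeatureEngineer` may differ in details such as the z-score window):

```python
import numpy as np
import pandas as pd

def compute_pva_features(df: pd.DataFrame,
                         return_periods=(1, 5, 10, 20),
                         vol_window=20,
                         stats_window=50) -> pd.DataFrame:
    """Compute the PVA return, acceleration, volatility and statistical features."""
    out = pd.DataFrame(index=df.index)
    # 1. Returns over several horizons
    for p in return_periods:
        out[f'return_{p}'] = df['close'] / df['close'].shift(p) - 1
    # 2. Acceleration: change in momentum at matching lags
    for p in (1, 5, 20):
        out[f'acceleration_{p}'] = out[f'return_{p}'] - out[f'return_{p}'].shift(p)
    # 3. Volatility features
    vol = out['return_1'].rolling(vol_window).std()
    out['volatility_returns'] = vol
    out['volatility_ratio'] = vol / vol.rolling(stats_window).mean()
    out['range_volatility'] = ((df['high'] - df['low']) / df['close']).rolling(vol_window).std()
    # 4. Statistical features (z-score window assumed to be stats_window)
    out['return_skew'] = out['return_1'].rolling(stats_window).skew()
    out['return_kurt'] = out['return_1'].rolling(stats_window).kurt()
    mean = out['return_1'].rolling(stats_window).mean()
    std = out['return_1'].rolling(stats_window).std()
    out['zscore_return'] = (out['return_1'] - mean) / std
    return out
```

Leading rows are NaN until the longest window (50) fills, so the trainer must drop or mask them before building sequences.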

## Training Pipeline

### Training Flow

```
┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Data      │───▶│  Feature    │───▶│  Encoder    │───▶│  XGBoost    │
│  Loading    │    │ Engineering │    │  Training   │    │  Training   │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘
                                             │
                                             ▼
                                    ┌─────────────────┐
                                    │ Validation &    │
                                    │ Model Saving    │
                                    └─────────────────┘
```

### Trainer Configuration

```yaml
PVATrainerConfig:
    # Data
    timeframe: '5m'
    batch_size: 64
    sequence_length: 100
    target_horizon: 12  # candles ahead

    # Training
    encoder_epochs: 50
    encoder_learning_rate: 1e-4
    early_stopping_patience: 10

    # Validation
    val_ratio: 0.15
    walk_forward_splits: 5
    min_train_size: 10000
```
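With `target_horizon: 12`, each sample's labels come from the forward return 12 candles ahead. A minimal sketch of how the direction and magnitude targets could be derived (the helper name is an assumption, not from the repo):

```python
import numpy as np
import pandas as pd

def make_targets(close: pd.Series, horizon: int = 12):
    """Split the forward return over `horizon` candles into direction and magnitude."""
    fwd_return = close.shift(-horizon) / close - 1
    direction = np.sign(fwd_return)   # +1 bullish, -1 bearish (NaN at the tail)
    magnitude = fwd_return.abs()      # absolute expected move
    return direction, magnitude
```

The last `horizon` rows have no label and must be excluded from training.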

### Walk-Forward Validation

The model uses walk-forward validation to evaluate performance:

```
Time ──────────────────────────────────────────────────────────────▶

Fold 1: [========= TRAIN =========][TEST]
Fold 2: [============= TRAIN =============][TEST]
Fold 3: [================= TRAIN =================][TEST]
Fold 4: [===================== TRAIN =====================][TEST]
Fold 5: [========================= TRAIN =========================][TEST]
```

Characteristics:

- Expanding training window
- 5 folds by default
- Optional gap between train and test to avoid leakage
- Metrics aggregated per fold
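The expanding-window scheme above can be generated in a few lines; a sketch (the function name and the equal-sized-fold assumption are mine, not from the repo):

```python
def walk_forward_splits(n_samples: int, n_folds: int = 5, gap: int = 0):
    """Yield (train_indices, test_indices) pairs with an expanding training window."""
    fold_size = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_end = k * fold_size                      # training window grows each fold
        test_start = train_end + gap                   # optional gap against leakage
        test_end = min(test_start + fold_size, n_samples)
        yield list(range(train_end)), list(range(test_start, test_end))
```

Every test segment lies strictly after its training window, so no fold can look ahead.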

## Evaluation Metrics

### Primary Metrics

| Metric | Description | Target |
|---|---|---|
| Direction Accuracy | Accuracy of the up/down call | >= 55% |
| Magnitude MAE | Mean absolute error on magnitude | Minimize |
| Directional Return | Average return accounting for predicted direction | > 0 |
| Sharpe Proxy | `mean(signed_returns) / std(signed_returns)` | > 1.0 |
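Direction Accuracy and the Sharpe proxy reduce to a couple of NumPy expressions; an illustrative sketch (function names assumed):

```python
import numpy as np

def direction_accuracy(pred_direction: np.ndarray, realized_return: np.ndarray) -> float:
    """Fraction of samples where the predicted sign matches the realized sign."""
    return float(np.mean(np.sign(pred_direction) == np.sign(realized_return)))

def sharpe_proxy(pred_direction: np.ndarray, realized_return: np.ndarray) -> float:
    """mean(signed_returns) / std(signed_returns), per the table above."""
    signed = np.sign(pred_direction) * realized_return
    return float(signed.mean() / signed.std())
```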

### Secondary Metrics

| Metric | Description |
|---|---|
| Encoder Loss | Autoencoder MSE |
| Confidence Calibration | Alignment between confidence and accuracy |
| Per-Symbol Performance | Metrics broken down per symbol |

## API and Usage

### Main Class: PVAModel

```python
from models.strategies.pva import PVAModel, PVAConfig

# Configuration
config = PVAConfig(
    input_features=15,
    sequence_length=100,
    d_model=256,
    n_heads=8,
    n_layers=4,
    d_ff=1024,
    dropout=0.1,
    device='cuda'
)

# Initialize the model
model = PVAModel(config)

# Train the encoder
history = model.fit_encoder(
    X_train, y_train,
    X_val, y_val,
    epochs=50,
    batch_size=64
)

# Train the XGBoost heads
metrics = model.fit_xgboost(X_train, y_train, X_val, y_val)

# Predict
predictions = model.predict(X_new)
for pred in predictions:
    print(f"Direction: {pred.direction}, Magnitude: {pred.magnitude}")
```

### PVAPrediction Class

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class PVAPrediction:
    direction: float       # -1 to 1 (bearish to bullish)
    magnitude: float       # Expected absolute move
    confidence: float      # 0 to 1
    encoder_features: np.ndarray  # Latent representation

    @property
    def expected_return(self) -> float:
        return self.direction * self.magnitude

    @property
    def signal_strength(self) -> float:
        return abs(self.direction) * self.confidence
```
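`signal_strength` lends itself to a simple confidence gate before acting on a prediction; a hypothetical filter (the threshold and names are illustrative, not part of the spec):

```python
def to_signal(direction: float, confidence: float, min_strength: float = 0.3) -> str:
    """Map a PVA prediction to a discrete trading signal, gating on signal strength."""
    strength = abs(direction) * confidence   # same definition as signal_strength
    if strength < min_strength:
        return 'flat'                        # too weak or too uncertain to trade
    return 'long' if direction > 0 else 'short'
```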

### PVATrainer Class

```python
from models.strategies.pva import PVATrainer, PVATrainerConfig

# Configure the trainer
config = PVATrainerConfig(
    timeframe='5m',
    sequence_length=100,
    target_horizon=12,
    encoder_epochs=50
)

trainer = PVATrainer(config)

# Train for a single symbol
model, metrics = trainer.train(
    symbol='XAUUSD',
    start_date='2023-01-01',
    end_date='2024-12-31'
)

# Walk-forward validation
results = trainer.walk_forward_train('XAUUSD', n_folds=5)
print(f"Avg Direction Accuracy: {results.avg_direction_accuracy:.2%}")

# Save the model
trainer.save_model(model, 'XAUUSD', 'v1.0.0')
```

## File Structure

```
apps/ml-engine/src/models/strategies/pva/
├── __init__.py
├── model.py               # PVAModel, PVAConfig, PVAPrediction
├── feature_engineering.py # PVAFeatureEngineer, PVAFeatureConfig
├── trainer.py             # PVATrainer, TrainingMetrics
└── attention.py           # PriceVariationAttention encoder
```

## Production Considerations

### GPU Acceleration

```python
import torch

# Automatic GPU detection
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = PVAModel(config, device=device)

# XGBoost on GPU (XGBoost >= 2.0: device='cuda' replaces the
# deprecated 'gpu_hist' tree method)
xgb_params = {
    'tree_method': 'hist',
    'device': 'cuda'
}
```

### Model Versioning

```
models/pva/{symbol}/{version}/
├── encoder.pt           # PyTorch encoder weights
├── xgb_direction.joblib # XGBoost direction classifier
├── xgb_magnitude.joblib # XGBoost magnitude regressor
├── config.json          # Model configuration
├── metadata.json        # Training metadata
└── feature_names.json   # Feature column names
```
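A small helper can materialize this layout; a standard-library sketch (the helper names are assumptions, and only `config.json` is shown):

```python
import json
from pathlib import Path

def model_dir(root: str, symbol: str, version: str) -> Path:
    """models/pva/{symbol}/{version}/ under the given root."""
    return Path(root) / 'models' / 'pva' / symbol / version

def save_config(root: str, symbol: str, version: str, config: dict) -> Path:
    """Create the versioned directory and write the model configuration."""
    target = model_dir(root, symbol, version)
    target.mkdir(parents=True, exist_ok=True)
    (target / 'config.json').write_text(json.dumps(config, indent=2))
    return target
```

Keeping symbol and version in the path lets inference pin an exact artifact set per symbol.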

### Inference Batch Size

| Scenario | Recommended batch size |
|---|---|
| Real-time (single prediction) | 1 |
| Backtesting | 256 |
| Bulk inference | 1024 |
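For the backtesting and bulk scenarios, inference input is typically chunked to the recommended size; a trivial batching helper (illustrative):

```python
def iter_batches(sequences, batch_size: int):
    """Yield consecutive slices of at most `batch_size` items."""
    for start in range(0, len(sequences), batch_size):
        yield sequences[start:start + batch_size]
```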


Author: ML-Specialist (NEXUS v4.0)
Date: 2026-01-25