---
id: "ET-ML-010"
title: "PVA (Price Variation Attention) Strategy"
type: "Technical Specification"
status: "Approved"
priority: "High"
epic: "OQI-006"
project: "trading-platform"
version: "1.0.0"
created_date: "2026-01-25"
updated_date: "2026-01-25"
task_reference: "TASK-2026-01-25-ML-TRAINING-ENHANCEMENT"
---

# ET-ML-010: PVA (Price Variation Attention) Strategy

## Metadata

| Field | Value |
|-------|-------|
| **ID** | ET-ML-010 |
| **Epic** | OQI-006 - ML Signals |
| **Type** | Technical Specification |
| **Version** | 1.0.0 |
| **Status** | Approved |
| **Last Updated** | 2026-01-25 |
| **Task Reference** | TASK-2026-01-25-ML-TRAINING-ENHANCEMENT |

---

## Summary

The PVA (Price Variation Attention) strategy is a hybrid model that combines a **Transformer Encoder** for representation learning with **XGBoost** for the final predictions. The model predicts both the **direction** and the **magnitude** of price variations over a future horizon.

### Key Characteristics

- **Time-Agnostic Design**: no temporal features (hour, day) are used, to avoid overfitting
- **6 Independent Models**: one model per symbol (XAUUSD, EURUSD, GBPUSD, USDJPY, BTCUSD, ETHUSD)
- **Hybrid Architecture**: Transformer Encoder + XGBoost head
- **Prediction Targets**: direction (bullish/bearish) and magnitude of the move

---

## Architecture

### High-Level Diagram

```
Input: OHLCV Sequence (seq_len x n_features)
  └── Returns, Acceleration, Volatility features
        │
        ▼
TRANSFORMER ENCODER
  Input Linear Projection ──▶ Positional Encoding ──▶ Encoder Layers (4)
        │
        ▼
  Sequence Pooling (Mean + Last)
        │
        ▼
XGBOOST HEAD
  Direction Classifier (binary: up/down)
  Magnitude Regressor (absolute magnitude)
        │
        ▼
Output: PVAPrediction
  - direction: float (-1 to 1, bearish to bullish)
  - magnitude: float (absolute expected move)
  - confidence: float (0 to 1)
```

### Transformer Encoder Components

| Component | Configuration |
|-----------|---------------|
| **Layers** | 4 encoder layers |
| **d_model** | 256 |
| **n_heads** | 8 attention heads |
| **d_ff** | 1024 (feed-forward dimension) |
| **dropout** | 0.1 |
| **positional_encoding** | Sinusoidal |
| **sequence_length** | 100 candles |

### XGBoost Configuration

| Parameter | Value |
|-----------|-------|
| **n_estimators** | 200 |
| **max_depth** | 6 |
| **learning_rate** | 0.05 |
| **subsample** | 0.8 |
| **colsample_bytree** | 0.8 |
| **reg_alpha** | 0.1 |
| **reg_lambda** | 1.0 |
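
As an illustration, the encoder table above maps almost directly onto PyTorch's built-in transformer modules. This is a minimal sketch under stated assumptions, not the production implementation: the class name `PVAEncoder` is hypothetical, and the sinusoidal positional encoding is omitted for brevity.

```python
import torch
import torch.nn as nn


class PVAEncoder(nn.Module):
    """Sketch of the encoder configuration above (hypothetical class name).

    Sinusoidal positional encoding from the table is omitted here for brevity.
    """

    def __init__(self, input_features: int = 15, d_model: int = 256,
                 n_heads: int = 8, n_layers: int = 4, d_ff: int = 1024,
                 dropout: float = 0.1):
        super().__init__()
        # Input linear projection: n_features -> d_model
        self.input_proj = nn.Linear(input_features, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=d_ff,
            dropout=dropout, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, input_features)
        h = self.encoder(self.input_proj(x))  # (batch, seq_len, d_model)
        # Sequence pooling: concatenate mean over time and the last hidden state
        return torch.cat([h.mean(dim=1), h[:, -1]], dim=-1)  # (batch, 2*d_model)


x = torch.randn(4, 100, 15)
features = PVAEncoder()(x)
print(features.shape)  # torch.Size([4, 512])
```

The pooled `2 * d_model` vector is what the XGBoost head would consume as tabular input.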
---

## Feature Engineering

### Time-Agnostic Design

The model **does not use temporal features** (hour of day, day of week) in order to:

- Avoid overfitting to specific temporal patterns
- Improve generalization across different market conditions
- Reduce the risk of concept drift

### Implemented Features

```python
PVAFeatureConfig:
    return_periods: [1, 5, 10, 20]
    volatility_window: 20
    stats_window: 50
    sequence_length: 100
```

#### 1. Return Features

| Feature | Description | Formula |
|---------|-------------|---------|
| `return_1` | 1-period return | `(close / close.shift(1)) - 1` |
| `return_5` | 5-period return | `(close / close.shift(5)) - 1` |
| `return_10` | 10-period return | `(close / close.shift(10)) - 1` |
| `return_20` | 20-period return | `(close / close.shift(20)) - 1` |

#### 2. Acceleration Features

| Feature | Description | Formula |
|---------|-------------|---------|
| `acceleration_1` | Change in short-term momentum | `return_1 - return_1.shift(1)` |
| `acceleration_5` | Change in medium-term momentum | `return_5 - return_5.shift(5)` |
| `acceleration_20` | Change in long-term momentum | `return_20 - return_20.shift(20)` |

#### 3. Volatility Features

| Feature | Description | Formula |
|---------|-------------|---------|
| `volatility_returns` | Volatility of returns | `return_1.rolling(20).std()` |
| `volatility_ratio` | Current/average volatility ratio | `volatility / volatility.rolling(50).mean()` |
| `range_volatility` | Volatility of ranges | `((high - low) / close).rolling(20).std()` |

#### 4. Statistical Features

| Feature | Description | Formula |
|---------|-------------|---------|
| `return_skew` | Skewness of returns | `return_1.rolling(50).skew()` |
| `return_kurt` | Kurtosis of returns | `return_1.rolling(50).kurt()` |
| `zscore_return` | Z-score of the return | `(return_1 - mean) / std` |
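
The formulas above translate almost directly into pandas. The sketch below is illustrative only — the actual implementation lives in `feature_engineering.py` as `PVAFeatureEngineer`, and the helper name `compute_pva_features` is hypothetical.

```python
import numpy as np
import pandas as pd


def compute_pva_features(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative sketch of the feature formulas above.

    Assumes a DataFrame with 'close', 'high', and 'low' columns.
    """
    out = pd.DataFrame(index=df.index)

    # 1. Returns over multiple horizons
    for p in (1, 5, 10, 20):
        out[f'return_{p}'] = (df['close'] / df['close'].shift(p)) - 1

    # 2. Acceleration: change in momentum at each horizon
    for p in (1, 5, 20):
        out[f'acceleration_{p}'] = out[f'return_{p}'] - out[f'return_{p}'].shift(p)

    # 3. Volatility features (windows 20 and 50 as configured)
    out['volatility_returns'] = out['return_1'].rolling(20).std()
    out['volatility_ratio'] = (
        out['volatility_returns'] / out['volatility_returns'].rolling(50).mean())
    out['range_volatility'] = ((df['high'] - df['low']) / df['close']).rolling(20).std()

    # 4. Statistical features over the stats window (50)
    out['return_skew'] = out['return_1'].rolling(50).skew()
    out['return_kurt'] = out['return_1'].rolling(50).kurt()
    roll = out['return_1'].rolling(50)
    out['zscore_return'] = (out['return_1'] - roll.mean()) / roll.std()

    return out


# Synthetic random-walk prices, just to exercise the function
rng = np.random.default_rng(0)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 300))))
df = pd.DataFrame({'close': close, 'high': close * 1.001, 'low': close * 0.999})
features = compute_pva_features(df)
print(features.shape)  # (300, 13)
```

The leading rows contain NaNs from the shifts and rolling windows; in training they would be dropped before building sequences.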
---

## Training Pipeline

### Training Flow

```
┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│    Data     │───▶│   Feature   │───▶│   Encoder   │───▶│   XGBoost   │
│   Loading   │    │ Engineering │    │  Training   │    │  Training   │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘
                                                                │
                                                                ▼
                                                      ┌─────────────────┐
                                                      │  Validation &   │
                                                      │  Model Saving   │
                                                      └─────────────────┘
```

### Trainer Configuration

```python
PVATrainerConfig:
    # Data
    timeframe: '5m'
    batch_size: 64
    sequence_length: 100
    target_horizon: 12  # candles ahead

    # Training
    encoder_epochs: 50
    encoder_learning_rate: 1e-4
    early_stopping_patience: 10

    # Validation
    val_ratio: 0.15
    walk_forward_splits: 5
    min_train_size: 10000
```

### Walk-Forward Validation

The model uses **walk-forward validation** to evaluate performance:

```
Time ──────────────────────────────────────────────────────────────▶

Fold 1: [========= TRAIN =========][TEST]
Fold 2: [============= TRAIN =============][TEST]
Fold 3: [================= TRAIN =================][TEST]
Fold 4: [===================== TRAIN =====================][TEST]
Fold 5: [========================= TRAIN =========================][TEST]
```

**Characteristics:**

- Expanding window
- 5 folds by default
- Optional gap between train and test
- Metrics aggregated per fold
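
The expanding-window scheme can be sketched as a plain index generator. The helper below is a hypothetical illustration of the split geometry, not the trainer's actual implementation; parameter names mirror `PVATrainerConfig` where possible.

```python
def walk_forward_splits(n_samples: int, n_folds: int = 5,
                        min_train_size: int = 0, gap: int = 0):
    """Expanding-window splits as in the diagram above (illustrative sketch).

    Each fold trains on everything up to a cut point and tests on the
    following segment; the training window grows with every fold.
    """
    test_size = (n_samples - min_train_size) // (n_folds + 1)
    splits = []
    for k in range(1, n_folds + 1):
        train_end = min_train_size + k * test_size
        test_start = train_end + gap          # optional gap between train and test
        test_end = min(test_start + test_size, n_samples)
        splits.append((range(0, train_end), range(test_start, test_end)))
    return splits


for train_idx, test_idx in walk_forward_splits(1200, n_folds=5):
    print(len(train_idx), len(test_idx))
```

With 1200 samples and 5 folds, the training window grows from 200 to 1000 samples while each test segment stays 200 samples long, matching the picture above.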
---

## Evaluation Metrics

### Primary Metrics

| Metric | Description | Target |
|--------|-------------|--------|
| **Direction Accuracy** | Accuracy on direction (up/down) | >= 55% |
| **Magnitude MAE** | Mean absolute error on magnitude | Minimize |
| **Directional Return** | Average return when trading the predicted direction | > 0 |
| **Sharpe Proxy** | `mean(signed_returns) / std(signed_returns)` | > 1.0 |

### Secondary Metrics

| Metric | Description |
|--------|-------------|
| **Encoder Loss** | Autoencoder MSE |
| **Confidence Calibration** | Alignment of confidence vs. accuracy |
| **Per-Symbol Performance** | Metrics broken down per symbol |
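
As a sketch of how the primary metrics can be computed from predictions and realized returns — the helper `pva_metrics` is hypothetical and not part of the documented API:

```python
import numpy as np


def pva_metrics(pred_direction, pred_magnitude, realized_returns):
    """Illustrative computation of the primary metrics above."""
    pred_direction = np.asarray(pred_direction, dtype=float)
    pred_magnitude = np.asarray(pred_magnitude, dtype=float)
    realized = np.asarray(realized_returns, dtype=float)

    # Direction accuracy: fraction of predictions with the correct sign
    direction_accuracy = np.mean(np.sign(pred_direction) == np.sign(realized))

    # Magnitude MAE against the absolute realized move
    magnitude_mae = np.mean(np.abs(pred_magnitude - np.abs(realized)))

    # Returns earned by trading in the predicted direction
    signed_returns = np.sign(pred_direction) * realized
    directional_return = signed_returns.mean()

    # Sharpe proxy as defined above: mean(signed_returns) / std(signed_returns)
    sharpe_proxy = signed_returns.mean() / signed_returns.std()

    return direction_accuracy, magnitude_mae, directional_return, sharpe_proxy


acc, mae, dret, sharpe = pva_metrics(
    pred_direction=[0.8, -0.3, 0.5, -0.9],
    pred_magnitude=[0.004, 0.002, 0.003, 0.005],
    realized_returns=[0.005, 0.001, 0.002, -0.006],
)
print(f"accuracy={acc:.2f}")  # accuracy=0.75
```

Note that the Sharpe proxy here is per-prediction, not annualized; it is a relative ranking signal rather than a portfolio Sharpe ratio.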
---

## API and Usage

### Main Class: PVAModel

```python
from models.strategies.pva import PVAModel, PVAConfig

# Configuration
config = PVAConfig(
    input_features=15,
    sequence_length=100,
    d_model=256,
    n_heads=8,
    n_layers=4,
    d_ff=1024,
    dropout=0.1,
    device='cuda'
)

# Initialize the model
model = PVAModel(config)

# Train the encoder
history = model.fit_encoder(
    X_train, y_train,
    X_val, y_val,
    epochs=50,
    batch_size=64
)

# Train XGBoost
metrics = model.fit_xgboost(X_train, y_train, X_val, y_val)

# Prediction
predictions = model.predict(X_new)
for pred in predictions:
    print(f"Direction: {pred.direction}, Magnitude: {pred.magnitude}")
```

### PVAPrediction Class

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class PVAPrediction:
    direction: float              # -1 to 1 (bearish to bullish)
    magnitude: float              # Expected absolute move
    confidence: float             # 0 to 1
    encoder_features: np.ndarray  # Latent representation

    @property
    def expected_return(self) -> float:
        return self.direction * self.magnitude

    @property
    def signal_strength(self) -> float:
        return abs(self.direction) * self.confidence
```

### PVATrainer Class

```python
from models.strategies.pva import PVATrainer, PVATrainerConfig

# Configure the trainer
config = PVATrainerConfig(
    timeframe='5m',
    sequence_length=100,
    target_horizon=12,
    encoder_epochs=50
)

trainer = PVATrainer(config)

# Train for a single symbol
model, metrics = trainer.train(
    symbol='XAUUSD',
    start_date='2023-01-01',
    end_date='2024-12-31'
)

# Walk-forward validation
results = trainer.walk_forward_train('XAUUSD', n_folds=5)
print(f"Avg Direction Accuracy: {results.avg_direction_accuracy:.2%}")

# Save the model
trainer.save_model(model, 'XAUUSD', 'v1.0.0')
```
---

## File Structure

```
apps/ml-engine/src/models/strategies/pva/
├── __init__.py
├── model.py                # PVAModel, PVAConfig, PVAPrediction
├── feature_engineering.py  # PVAFeatureEngineer, PVAFeatureConfig
├── trainer.py              # PVATrainer, TrainingMetrics
└── attention.py            # PriceVariationAttention encoder
```
---

## Production Considerations

### GPU Acceleration

```python
import torch

# Automatic GPU detection
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = PVAModel(config, device=device)

# XGBoost with GPU. Since XGBoost 2.0 the 'gpu_hist' tree method is
# deprecated; use tree_method='hist' together with device='cuda'.
xgb_params = {
    'tree_method': 'hist',
    'device': 'cuda'
}
```

### Model Versioning

```
models/pva/{symbol}/{version}/
├── encoder.pt             # PyTorch encoder weights
├── xgb_direction.joblib   # XGBoost direction classifier
├── xgb_magnitude.joblib   # XGBoost magnitude regressor
├── config.json            # Model configuration
├── metadata.json          # Training metadata
└── feature_names.json     # Feature column names
```

### Inference Batch Size

| Scenario | Recommended Batch Size |
|----------|------------------------|
| Real-time single | 1 |
| Backtesting | 256 |
| Bulk inference | 1024 |
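
The versioned layout above can be produced with a small helper. This is an illustrative sketch — the function name `save_versioned` is hypothetical; the binary artifacts (`encoder.pt`, `*.joblib`) would be written separately by `torch.save` and `joblib.dump`.

```python
import json
from pathlib import Path


def save_versioned(model_dir: str, symbol: str, version: str,
                   config: dict, metadata: dict, feature_names: list) -> Path:
    """Write the JSON artifacts of one model version (hypothetical helper).

    Produces models/pva-style paths: {model_dir}/{symbol}/{version}/...
    """
    root = Path(model_dir) / symbol / version
    root.mkdir(parents=True, exist_ok=True)
    (root / 'config.json').write_text(json.dumps(config, indent=2))
    (root / 'metadata.json').write_text(json.dumps(metadata, indent=2))
    (root / 'feature_names.json').write_text(json.dumps(feature_names))
    return root


root = save_versioned('models/pva', 'XAUUSD', 'v1.0.0',
                      config={'d_model': 256},
                      metadata={'trained': '2026-01-25'},
                      feature_names=['return_1', 'return_5'])
print(root)
```

Keeping `feature_names.json` alongside the weights lets inference validate that incoming feature columns match what the model was trained on before predicting.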
---

## References

- [ET-ML-001: ML Engine Architecture](./ET-ML-001-arquitectura.md)
- [ET-ML-003: Feature Engineering](./ET-ML-003-features.md)
- [ET-ML-015: Backtesting Framework](./ET-ML-015-backtesting-framework.md)
- [Attention Is All You Need (Vaswani et al.)](https://arxiv.org/abs/1706.03762)

---

**Author:** ML-Specialist (NEXUS v4.0)

**Date:** 2026-01-25