| id | title | type | status | priority | epic | project | version | created_date | updated_date | task_reference |
|---|---|---|---|---|---|---|---|---|---|---|
| ET-ML-010 | PVA (Price Variation Attention) Strategy | Technical Specification | Approved | High | OQI-006 | trading-platform | 1.0.0 | 2026-01-25 | 2026-01-25 | TASK-2026-01-25-ML-TRAINING-ENHANCEMENT |
# ET-ML-010: PVA (Price Variation Attention) Strategy

## Metadata
| Field | Value |
|---|---|
| ID | ET-ML-010 |
| Epic | OQI-006 - ML Signals |
| Type | Technical Specification |
| Version | 1.0.0 |
| Status | Approved |
| Last updated | 2026-01-25 |
| Reference Task | TASK-2026-01-25-ML-TRAINING-ENHANCEMENT |
## Summary

The PVA (Price Variation Attention) strategy is a hybrid model that combines a Transformer Encoder for representation learning with XGBoost for the final predictions. The model predicts the direction and magnitude of price variations over a future horizon.

### Key Features

- Time-agnostic design: no temporal features (hour, day) are used, to avoid overfitting
- 6 independent models: one model per symbol (XAUUSD, EURUSD, GBPUSD, USDJPY, BTCUSD, ETHUSD)
- Hybrid architecture: Transformer Encoder + XGBoost head
- Prediction targets: direction (bullish/bearish) and magnitude of the move
## Architecture

### High-Level Diagram
```
┌─────────────────────────────────────────────────────────────────┐
│                           PVA MODEL                             │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Input: OHLCV Sequence (seq_len x n_features)                   │
│    └── Returns, Acceleration, Volatility features               │
│                               │                                 │
│                               ▼                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                    TRANSFORMER ENCODER                    │  │
│  │  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   │  │
│  │  │ Input Linear │──▶│ Positional   │──▶│ Encoder      │   │  │
│  │  │ Projection   │   │ Encoding     │   │ Layers (4)   │   │  │
│  │  └──────────────┘   └──────────────┘   └──────────────┘   │  │
│  │                            │                              │  │
│  │                            ▼                              │  │
│  │                 ┌──────────────────┐                      │  │
│  │                 │ Sequence Pooling │                      │  │
│  │                 │   (Mean + Last)  │                      │  │
│  │                 └──────────────────┘                      │  │
│  └───────────────────────────────────────────────────────────┘  │
│                               │                                 │
│                               ▼                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                        XGBOOST HEAD                       │  │
│  │   ┌──────────────────────┐   ┌──────────────────────┐    │  │
│  │   │ Direction Classifier │   │ Magnitude Regressor  │    │  │
│  │   │  (binary: up/down)   │   │ (absolute magnitude) │    │  │
│  │   └──────────────────────┘   └──────────────────────┘    │  │
│  └───────────────────────────────────────────────────────────┘  │
│                               │                                 │
│                               ▼                                 │
│  Output: PVAPrediction                                          │
│    - direction: float (-1 to 1)                                 │
│    - magnitude: float (absolute expected move)                  │
│    - confidence: float (0 to 1)                                 │
└─────────────────────────────────────────────────────────────────┘
```
### Transformer Encoder Components

| Component | Configuration |
|---|---|
| Layers | 4 encoder layers |
| d_model | 256 |
| n_heads | 8 attention heads |
| d_ff | 1024 (feed-forward dimension) |
| dropout | 0.1 |
| positional_encoding | Sinusoidal |
| sequence_length | 100 candles |
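The encoder path from the diagram can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the production implementation: the class name `PVAEncoder` and the concatenated mean + last pooling (output of size `2 * d_model`) are inferred from the diagram, while the hyperparameters follow the table above.

```python
import math
import torch
import torch.nn as nn

class PVAEncoder(nn.Module):
    """Sketch: input projection -> sinusoidal positional encoding ->
    4 Transformer encoder layers -> mean + last pooling."""

    def __init__(self, n_features: int, d_model: int = 256, n_heads: int = 8,
                 n_layers: int = 4, d_ff: int = 1024, dropout: float = 0.1,
                 max_len: int = 100):
        super().__init__()
        self.input_proj = nn.Linear(n_features, d_model)
        # Precompute the sinusoidal positional encoding table (max_len x d_model)
        pos = torch.arange(max_len).unsqueeze(1)
        div = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(pos * div)
        pe[:, 1::2] = torch.cos(pos * div)
        self.register_buffer('pe', pe)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=d_ff,
            dropout=dropout, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, n_features)
        h = self.input_proj(x) + self.pe[:x.size(1)]
        h = self.encoder(h)
        # Mean + last pooling -> (batch, 2 * d_model)
        return torch.cat([h.mean(dim=1), h[:, -1]], dim=-1)

x = torch.randn(4, 100, 15)            # 4 windows of 100 candles x 15 features
emb = PVAEncoder(n_features=15)(x)
print(emb.shape)                       # torch.Size([4, 512])
```

The pooled embedding (512 dimensions with these settings) is what the XGBoost head consumes.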
### XGBoost Configuration

| Parameter | Value |
|---|---|
| n_estimators | 200 |
| max_depth | 6 |
| learning_rate | 0.05 |
| subsample | 0.8 |
| colsample_bytree | 0.8 |
| reg_alpha | 0.1 |
| reg_lambda | 1.0 |
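For reference, the table maps directly onto a parameter dictionary that both heads could share. `XGB_PARAMS` and the objective choices in the comments are illustrative assumptions, not the project's actual code.

```python
# Hypothetical helper: the table's hyperparameters as keyword arguments
# shared by the direction classifier and the magnitude regressor.
XGB_PARAMS = {
    'n_estimators': 200,
    'max_depth': 6,
    'learning_rate': 0.05,
    'subsample': 0.8,
    'colsample_bytree': 0.8,
    'reg_alpha': 0.1,
    'reg_lambda': 1.0,
}

# With xgboost installed, the two heads would plausibly be built as:
#   xgboost.XGBClassifier(objective='binary:logistic', **XGB_PARAMS)
#   xgboost.XGBRegressor(objective='reg:squarederror', **XGB_PARAMS)
print(len(XGB_PARAMS))  # 7
```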
## Feature Engineering

### Time-Agnostic Design

The model does not use temporal features (hour of day, day of week) in order to:
- Avoid overfitting to specific temporal patterns
- Improve generalization across different market conditions
- Reduce the risk of concept drift

### Implemented Features

```yaml
PVAFeatureConfig:
  return_periods: [1, 5, 10, 20]
  volatility_window: 20
  stats_window: 50
  sequence_length: 100
```
#### 1. Return Features

| Feature | Description | Formula |
|---|---|---|
| return_1 | 1-period return | `(close / close.shift(1)) - 1` |
| return_5 | 5-period return | `(close / close.shift(5)) - 1` |
| return_10 | 10-period return | `(close / close.shift(10)) - 1` |
| return_20 | 20-period return | `(close / close.shift(20)) - 1` |
#### 2. Acceleration Features

| Feature | Description | Formula |
|---|---|---|
| acceleration_1 | Change in short-term momentum | `return_1 - return_1.shift(1)` |
| acceleration_5 | Change in medium-term momentum | `return_5 - return_5.shift(5)` |
| acceleration_20 | Change in long-term momentum | `return_20 - return_20.shift(20)` |
#### 3. Volatility Features

| Feature | Description | Formula |
|---|---|---|
| volatility_returns | Volatility of returns | `return_1.rolling(20).std()` |
| volatility_ratio | Current/average volatility ratio | `volatility / volatility.rolling(50).mean()` |
| range_volatility | Volatility of candle ranges | `((high - low) / close).rolling(20).std()` |
#### 4. Statistical Features

| Feature | Description | Formula |
|---|---|---|
| return_skew | Skewness of returns | `return_1.rolling(50).skew()` |
| return_kurt | Kurtosis of returns | `return_1.rolling(50).kurt()` |
| zscore_return | Z-score of the return | `(return_1 - mean) / std` |
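The four feature groups above can be sketched in pandas as follows, assuming an OHLCV DataFrame with `high`, `low`, and `close` columns. `build_pva_features` is a hypothetical name; the real pipeline lives in `PVAFeatureEngineer`.

```python
import numpy as np
import pandas as pd

def build_pva_features(df: pd.DataFrame) -> pd.DataFrame:
    """Sketch of the four feature groups (13 columns total)."""
    out = pd.DataFrame(index=df.index)
    close = df['close']
    # 1. Returns over multiple horizons
    for p in (1, 5, 10, 20):
        out[f'return_{p}'] = close / close.shift(p) - 1
    # 2. Acceleration: change of each return over its own horizon
    for p in (1, 5, 20):
        out[f'acceleration_{p}'] = out[f'return_{p}'] - out[f'return_{p}'].shift(p)
    # 3. Volatility (windows from PVAFeatureConfig: 20 and 50)
    out['volatility_returns'] = out['return_1'].rolling(20).std()
    out['volatility_ratio'] = (
        out['volatility_returns'] / out['volatility_returns'].rolling(50).mean())
    out['range_volatility'] = ((df['high'] - df['low']) / close).rolling(20).std()
    # 4. Statistics over the 50-bar window
    out['return_skew'] = out['return_1'].rolling(50).skew()
    out['return_kurt'] = out['return_1'].rolling(50).kurt()
    mean = out['return_1'].rolling(50).mean()
    std = out['return_1'].rolling(50).std()
    out['zscore_return'] = (out['return_1'] - mean) / std
    return out

# Synthetic price path just to exercise the function
rng = np.random.default_rng(0)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.001, 300))))
df = pd.DataFrame({'close': close, 'high': close * 1.001, 'low': close * 0.999})
feats = build_pva_features(df)
print(feats.shape[1])  # 13
```

Leading rows contain NaNs from the shifts and rolling windows; in training they would be dropped before building sequences.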
## Training Pipeline

### Training Flow

```
┌─────────────┐   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐
│    Data     │──▶│   Feature   │──▶│   Encoder   │──▶│   XGBoost   │
│   Loading   │   │ Engineering │   │  Training   │   │  Training   │
└─────────────┘   └─────────────┘   └─────────────┘   └─────────────┘
                                                             │
                                                             ▼
                                                    ┌─────────────────┐
                                                    │  Validation &   │
                                                    │  Model Saving   │
                                                    └─────────────────┘
```
### Trainer Configuration

```yaml
PVATrainerConfig:
  # Data
  timeframe: '5m'
  batch_size: 64
  sequence_length: 100
  target_horizon: 12  # candles ahead
  # Training
  encoder_epochs: 50
  encoder_learning_rate: 1e-4
  early_stopping_patience: 10
  # Validation
  val_ratio: 0.15
  walk_forward_splits: 5
  min_train_size: 10000
```
### Walk-Forward Validation

The model uses walk-forward validation to evaluate performance:

```
Time ──────────────────────────────────────────────────────────────▶
Fold 1: [========= TRAIN =========][TEST]
Fold 2: [============= TRAIN =============][TEST]
Fold 3: [================= TRAIN =================][TEST]
Fold 4: [===================== TRAIN =====================][TEST]
Fold 5: [========================= TRAIN =========================][TEST]
```

Characteristics:
- Expanding window
- 5 folds by default
- Optional gap between train and test
- Metrics aggregated per fold
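The expanding-window split above can be sketched as follows, assuming equal-sized test blocks. `expanding_walk_forward` is a hypothetical helper, not the project's actual splitter.

```python
def expanding_walk_forward(n_samples: int, n_folds: int = 5, gap: int = 0):
    """Yield (train_indices, test_indices) pairs: each fold trains on
    everything before its test block, with an optional gap in between."""
    fold_size = n_samples // (n_folds + 1)
    for i in range(n_folds):
        test_start = (i + 1) * fold_size
        test_end = test_start + fold_size
        yield range(0, test_start - gap), range(test_start, test_end)

for train_idx, test_idx in expanding_walk_forward(120, n_folds=5):
    print(len(train_idx), len(test_idx))
# Each fold's train window grows by one fold_size; every test window
# stays the same size, preserving temporal order (no look-ahead).
```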
## Evaluation Metrics

### Primary Metrics

| Metric | Description | Target |
|---|---|---|
| Direction Accuracy | Accuracy on direction (up/down) | >= 55% |
| Magnitude MAE | Mean absolute error on magnitude | Minimize |
| Directional Return | Average return accounting for direction | > 0 |
| Sharpe Proxy | `mean(signed_returns) / std(signed_returns)` | > 1.0 |
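A minimal sketch of how these four metrics could be computed; `primary_metrics` is a hypothetical helper, and the sign-matching definition of direction accuracy is an assumption.

```python
import numpy as np

def primary_metrics(pred_direction, pred_magnitude, realized_returns):
    """pred_direction in [-1, 1]; realized_returns are the actual
    forward returns over the target horizon."""
    pred_direction = np.asarray(pred_direction, dtype=float)
    realized = np.asarray(realized_returns, dtype=float)
    # Return obtained by trading in the predicted direction
    signed = np.sign(pred_direction) * realized
    return {
        'direction_accuracy': float(
            np.mean(np.sign(pred_direction) == np.sign(realized))),
        'magnitude_mae': float(
            np.mean(np.abs(np.asarray(pred_magnitude) - np.abs(realized)))),
        'directional_return': float(np.mean(signed)),
        'sharpe_proxy': float(np.mean(signed) / np.std(signed)),
    }

m = primary_metrics(
    pred_direction=[0.8, -0.5, 0.3, 0.9],
    pred_magnitude=[0.002, 0.001, 0.003, 0.002],
    realized_returns=[0.001, 0.002, 0.002, -0.001],
)
print(m['direction_accuracy'])  # 0.5
```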
### Secondary Metrics

| Metric | Description |
|---|---|
| Encoder Loss | Autoencoder MSE |
| Confidence Calibration | Alignment between confidence and accuracy |
| Per-Symbol Performance | Metrics broken down per symbol |
## API and Usage

### Main Class: PVAModel

```python
from models.strategies.pva import PVAModel, PVAConfig

# Configuration
config = PVAConfig(
    input_features=15,
    sequence_length=100,
    d_model=256,
    n_heads=8,
    n_layers=4,
    d_ff=1024,
    dropout=0.1,
    device='cuda'
)

# Initialize the model
model = PVAModel(config)

# Train the encoder
history = model.fit_encoder(
    X_train, y_train,
    X_val, y_val,
    epochs=50,
    batch_size=64
)

# Train XGBoost
metrics = model.fit_xgboost(X_train, y_train, X_val, y_val)

# Prediction
predictions = model.predict(X_new)
for pred in predictions:
    print(f"Direction: {pred.direction}, Magnitude: {pred.magnitude}")
```
### PVAPrediction Class

```python
@dataclass
class PVAPrediction:
    direction: float              # -1 to 1 (bearish to bullish)
    magnitude: float              # Expected absolute move
    confidence: float             # 0 to 1
    encoder_features: np.ndarray  # Latent representation

    @property
    def expected_return(self) -> float:
        return self.direction * self.magnitude

    @property
    def signal_strength(self) -> float:
        return abs(self.direction) * self.confidence
```
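The two derived properties can be exercised with a standalone copy of the dataclass; the `encoder_features` default is added here only to make the snippet self-contained.

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class PVAPrediction:
    direction: float
    magnitude: float
    confidence: float
    encoder_features: np.ndarray = field(default_factory=lambda: np.zeros(1))

    @property
    def expected_return(self) -> float:
        # Signed expected move: direction scales the absolute magnitude
        return self.direction * self.magnitude

    @property
    def signal_strength(self) -> float:
        # Strong only when both direction and confidence are high
        return abs(self.direction) * self.confidence

pred = PVAPrediction(direction=0.8, magnitude=0.002, confidence=0.7)
print(pred.expected_return)   # ≈ 0.0016
print(pred.signal_strength)   # ≈ 0.56
```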
### PVATrainer Class

```python
from models.strategies.pva import PVATrainer, PVATrainerConfig

# Configure the trainer
config = PVATrainerConfig(
    timeframe='5m',
    sequence_length=100,
    target_horizon=12,
    encoder_epochs=50
)
trainer = PVATrainer(config)

# Train for a single symbol
model, metrics = trainer.train(
    symbol='XAUUSD',
    start_date='2023-01-01',
    end_date='2024-12-31'
)

# Walk-forward validation
results = trainer.walk_forward_train('XAUUSD', n_folds=5)
print(f"Avg Direction Accuracy: {results.avg_direction_accuracy:.2%}")

# Save the model
trainer.save_model(model, 'XAUUSD', 'v1.0.0')
```
## File Structure

```
apps/ml-engine/src/models/strategies/pva/
├── __init__.py
├── model.py                # PVAModel, PVAConfig, PVAPrediction
├── feature_engineering.py  # PVAFeatureEngineer, PVAFeatureConfig
├── trainer.py              # PVATrainer, TrainingMetrics
└── attention.py            # PriceVariationAttention encoder
```
## Production Considerations

### GPU Acceleration

```python
# Automatic GPU detection
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = PVAModel(config, device=device)

# XGBoost on GPU (XGBoost >= 2.0: 'hist' plus device='cuda';
# 'gpu_hist' is the deprecated pre-2.0 spelling)
xgb_params = {
    'tree_method': 'hist',
    'device': 'cuda'
}
```
### Model Versioning

```
models/pva/{symbol}/{version}/
├── encoder.pt             # PyTorch encoder weights
├── xgb_direction.joblib   # XGBoost direction classifier
├── xgb_magnitude.joblib   # XGBoost magnitude regressor
├── config.json            # Model configuration
├── metadata.json          # Training metadata
└── feature_names.json     # Feature column names
```
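A sketch of how the JSON side of this layout might be written; `save_model_metadata` is a hypothetical helper, and the binary artifacts (`encoder.pt`, the `.joblib` files) would be written separately via `torch.save` and `joblib.dump`.

```python
import json
import tempfile
from pathlib import Path

def save_model_metadata(root: Path, symbol: str, version: str,
                        config: dict, metadata: dict) -> Path:
    """Create models/pva/{symbol}/{version}/ and write the JSON files."""
    target = root / 'models' / 'pva' / symbol / version
    target.mkdir(parents=True, exist_ok=True)
    (target / 'config.json').write_text(json.dumps(config, indent=2))
    (target / 'metadata.json').write_text(json.dumps(metadata, indent=2))
    return target

with tempfile.TemporaryDirectory() as tmp:
    path = save_model_metadata(
        Path(tmp), 'XAUUSD', 'v1.0.0',
        config={'d_model': 256, 'sequence_length': 100},
        metadata={'trained_on': '2023-01-01..2024-12-31'},
    )
    # Round-trip the configuration to verify the layout
    reloaded = json.loads((path / 'config.json').read_text())
    print(reloaded['d_model'])  # 256
```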
### Inference Batch Size

| Scenario | Recommended Batch Size |
|---|---|
| Real-time single | 1 |
| Backtesting | 256 |
| Bulk inference | 1024 |
---

Author: ML-Specialist (NEXUS v4.0)
Date: 2026-01-25