
---
id: ET-ML-010
title: PVA (Price Variation Attention) Strategy
type: Technical Specification
status: Approved
priority: High
epic: OQI-006
project: trading-platform
version: 1.0.0
created_date: 2026-01-25
updated_date: 2026-01-25
task_reference: TASK-2026-01-25-ML-TRAINING-ENHANCEMENT
---

# ET-ML-010: PVA (Price Variation Attention) Strategy

## Metadata

| Field | Value |
|---|---|
| ID | ET-ML-010 |
| Epic | OQI-006 - ML Signals |
| Type | Technical Specification |
| Version | 1.0.0 |
| Status | Approved |
| Last updated | 2026-01-25 |
| Task reference | TASK-2026-01-25-ML-TRAINING-ENHANCEMENT |

## Summary

The PVA (Price Variation Attention) strategy is a hybrid model that combines a Transformer Encoder for representation learning with XGBoost for the final predictions. The model predicts the direction and magnitude of price variations over a future horizon.

### Key Features

- **Time-agnostic design**: no temporal features (hour, day) are used, to avoid overfitting
- **6 independent models**: one model per symbol (XAUUSD, EURUSD, GBPUSD, USDJPY, BTCUSD, ETHUSD)
- **Hybrid architecture**: Transformer Encoder + XGBoost head
- **Prediction targets**: direction (bullish/bearish) and magnitude of the move

## Architecture

### High-Level Diagram

```
┌─────────────────────────────────────────────────────────────────────────┐
│                        PVA MODEL                                         │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  Input: OHLCV Sequence (seq_len x n_features)                           │
│         └── Returns, Acceleration, Volatility features                  │
│                          │                                               │
│                          ▼                                               │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                  TRANSFORMER ENCODER                               │  │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐            │  │
│  │  │ Input Linear │──▶│ Positional   │──▶│ Encoder      │           │  │
│  │  │ Projection   │  │ Encoding     │  │ Layers (4)   │           │  │
│  │  └──────────────┘  └──────────────┘  └──────────────┘            │  │
│  │                                              │                     │  │
│  │                                              ▼                     │  │
│  │                                    ┌──────────────────┐           │  │
│  │                                    │ Sequence Pooling │           │  │
│  │                                    │ (Mean + Last)    │           │  │
│  │                                    └──────────────────┘           │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                          │                                               │
│                          ▼                                               │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                    XGBOOST HEAD                                    │  │
│  │  ┌──────────────────────┐    ┌──────────────────────┐            │  │
│  │  │ Direction Classifier │    │ Magnitude Regressor  │            │  │
│  │  │ (binary: up/down)    │    │ (absolute magnitude) │            │  │
│  │  └──────────────────────┘    └──────────────────────┘            │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                          │                                               │
│                          ▼                                               │
│  Output: PVAPrediction                                                   │
│          - direction: float (-1 to 1)                                    │
│          - magnitude: float (absolute expected move)                     │
│          - confidence: float (0 to 1)                                    │
└─────────────────────────────────────────────────────────────────────────┘
```

### Transformer Encoder Components

| Component | Configuration |
|---|---|
| Layers | 4 encoder layers |
| d_model | 256 |
| n_heads | 8 attention heads |
| d_ff | 1024 (feed-forward dimension) |
| dropout | 0.1 |
| positional_encoding | Sinusoidal |
| sequence_length | 100 candles |
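The sinusoidal positional encoding in the table is the standard formulation from the Transformer literature; a minimal NumPy sketch (the function name is illustrative, not taken from the repo):

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Standard sin/cos positional encoding (Vaswani et al., 2017)."""
    pos = np.arange(seq_len)[:, None]                  # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]               # (1, d_model/2)
    angle = pos / np.power(10000.0, 2 * i / d_model)   # one frequency per dimension pair
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle)   # even dimensions
    pe[:, 1::2] = np.cos(angle)   # odd dimensions
    return pe

# With the spec's settings this is a (100, 256) matrix
pe = sinusoidal_positional_encoding(100, 256)
```

The matrix is added element-wise to the output of the input linear projection before the encoder layers.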

### XGBoost Configuration

| Parameter | Value |
|---|---|
| n_estimators | 200 |
| max_depth | 6 |
| learning_rate | 0.05 |
| subsample | 0.8 |
| colsample_bytree | 0.8 |
| reg_alpha | 0.1 |
| reg_lambda | 1.0 |

## Feature Engineering

### Time-Agnostic Design

The model deliberately excludes temporal features (hour of day, day of week) in order to:

- Avoid overfitting to specific temporal patterns
- Improve generalization across different market conditions
- Reduce the risk of concept drift

### Implemented Features

```yaml
PVAFeatureConfig:
    return_periods: [1, 5, 10, 20]
    volatility_window: 20
    stats_window: 50
    sequence_length: 100
```

#### 1. Return Features

| Feature | Description | Formula |
|---|---|---|
| return_1 | 1-period return | `(close / close.shift(1)) - 1` |
| return_5 | 5-period return | `(close / close.shift(5)) - 1` |
| return_10 | 10-period return | `(close / close.shift(10)) - 1` |
| return_20 | 20-period return | `(close / close.shift(20)) - 1` |

#### 2. Acceleration Features

| Feature | Description | Formula |
|---|---|---|
| acceleration_1 | Change in short-term momentum | `return_1 - return_1.shift(1)` |
| acceleration_5 | Change in medium-term momentum | `return_5 - return_5.shift(5)` |
| acceleration_20 | Change in long-term momentum | `return_20 - return_20.shift(20)` |

#### 3. Volatility Features

| Feature | Description | Formula |
|---|---|---|
| volatility_returns | Volatility of returns | `return_1.rolling(20).std()` |
| volatility_ratio | Current vs. average volatility | `volatility / volatility.rolling(50).mean()` |
| range_volatility | Volatility of candle ranges | `((high - low) / close).rolling(20).std()` |

#### 4. Statistical Features

| Feature | Description | Formula |
|---|---|---|
| return_skew | Skewness of returns | `return_1.rolling(50).skew()` |
| return_kurt | Kurtosis of returns | `return_1.rolling(50).kurt()` |
| zscore_return | Z-score of the return | `(return_1 - mean) / std` |
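The tables above translate almost one-to-one into pandas rolling operations. A hedged sketch, assuming an OHLCV frame with `close`/`high`/`low` columns (the function name is illustrative; the repo's `PVAFeatureEngineer` may differ in details such as the z-score window):

```python
import numpy as np
import pandas as pd

def compute_pva_features(df: pd.DataFrame,
                         return_periods=(1, 5, 10, 20),
                         vol_window=20,
                         stats_window=50) -> pd.DataFrame:
    """Compute the PVA return, acceleration, volatility and statistical features."""
    out = pd.DataFrame(index=df.index)
    # 1. Returns over several horizons
    for p in return_periods:
        out[f'return_{p}'] = df['close'] / df['close'].shift(p) - 1
    # 2. Acceleration: change in momentum at matching lags
    for p in (1, 5, 20):
        out[f'acceleration_{p}'] = out[f'return_{p}'] - out[f'return_{p}'].shift(p)
    # 3. Volatility features
    vol = out['return_1'].rolling(vol_window).std()
    out['volatility_returns'] = vol
    out['volatility_ratio'] = vol / vol.rolling(stats_window).mean()
    out['range_volatility'] = ((df['high'] - df['low']) / df['close']).rolling(vol_window).std()
    # 4. Statistical features (z-score window assumed to be stats_window)
    out['return_skew'] = out['return_1'].rolling(stats_window).skew()
    out['return_kurt'] = out['return_1'].rolling(stats_window).kurt()
    mean = out['return_1'].rolling(stats_window).mean()
    std = out['return_1'].rolling(stats_window).std()
    out['zscore_return'] = (out['return_1'] - mean) / std
    return out
```

Leading rows are NaN until the longest window (50) fills, so the trainer must drop or mask them before building sequences.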

## Training Pipeline

### Training Flow

```
┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Data      │───▶│  Feature    │───▶│  Encoder    │───▶│  XGBoost    │
│  Loading    │    │ Engineering │    │  Training   │    │  Training   │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘
                                             │
                                             ▼
                                    ┌─────────────────┐
                                    │ Validation &    │
                                    │ Model Saving    │
                                    └─────────────────┘
```

### Trainer Configuration

```yaml
PVATrainerConfig:
    # Data
    timeframe: '5m'
    batch_size: 64
    sequence_length: 100
    target_horizon: 12  # candles ahead

    # Training
    encoder_epochs: 50
    encoder_learning_rate: 1e-4
    early_stopping_patience: 10

    # Validation
    val_ratio: 0.15
    walk_forward_splits: 5
    min_train_size: 10000
```
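With `target_horizon: 12`, each sample's labels come from the forward return 12 candles ahead. A minimal sketch of how the direction and magnitude targets could be derived (the helper name is an assumption, not from the repo):

```python
import numpy as np
import pandas as pd

def make_targets(close: pd.Series, horizon: int = 12):
    """Split the forward return over `horizon` candles into direction and magnitude."""
    fwd_return = close.shift(-horizon) / close - 1
    direction = np.sign(fwd_return)   # +1 bullish, -1 bearish (NaN at the tail)
    magnitude = fwd_return.abs()      # absolute expected move
    return direction, magnitude
```

The last `horizon` rows have no label and must be excluded from training.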

### Walk-Forward Validation

The model uses walk-forward validation to evaluate performance:

```
Time ──────────────────────────────────────────────────────────────▶

Fold 1: [========= TRAIN =========][TEST]
Fold 2: [============= TRAIN =============][TEST]
Fold 3: [================= TRAIN =================][TEST]
Fold 4: [===================== TRAIN =====================][TEST]
Fold 5: [========================= TRAIN =========================][TEST]
```

Characteristics:

- Expanding training window
- 5 folds by default
- Optional gap between train and test to avoid leakage
- Metrics aggregated per fold
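The expanding-window scheme above can be generated in a few lines; a sketch (the function name and the equal-sized-fold assumption are mine, not from the repo):

```python
def walk_forward_splits(n_samples: int, n_folds: int = 5, gap: int = 0):
    """Yield (train_indices, test_indices) pairs with an expanding training window."""
    fold_size = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_end = k * fold_size                      # training window grows each fold
        test_start = train_end + gap                   # optional gap against leakage
        test_end = min(test_start + fold_size, n_samples)
        yield list(range(train_end)), list(range(test_start, test_end))
```

Every test segment lies strictly after its training window, so no fold can look ahead.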

## Evaluation Metrics

### Primary Metrics

| Metric | Description | Target |
|---|---|---|
| Direction Accuracy | Accuracy of the up/down call | >= 55% |
| Magnitude MAE | Mean absolute error on magnitude | Minimize |
| Directional Return | Average return accounting for predicted direction | > 0 |
| Sharpe Proxy | `mean(signed_returns) / std(signed_returns)` | > 1.0 |
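Direction Accuracy and the Sharpe proxy reduce to a couple of NumPy expressions; an illustrative sketch (function names assumed):

```python
import numpy as np

def direction_accuracy(pred_direction: np.ndarray, realized_return: np.ndarray) -> float:
    """Fraction of samples where the predicted sign matches the realized sign."""
    return float(np.mean(np.sign(pred_direction) == np.sign(realized_return)))

def sharpe_proxy(pred_direction: np.ndarray, realized_return: np.ndarray) -> float:
    """mean(signed_returns) / std(signed_returns), per the table above."""
    signed = np.sign(pred_direction) * realized_return
    return float(signed.mean() / signed.std())
```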

### Secondary Metrics

| Metric | Description |
|---|---|
| Encoder Loss | Autoencoder MSE |
| Confidence Calibration | Alignment between confidence and accuracy |
| Per-Symbol Performance | Metrics broken down per symbol |

## API and Usage

### Main Class: PVAModel

```python
from models.strategies.pva import PVAModel, PVAConfig

# Configuration
config = PVAConfig(
    input_features=15,
    sequence_length=100,
    d_model=256,
    n_heads=8,
    n_layers=4,
    d_ff=1024,
    dropout=0.1,
    device='cuda'
)

# Initialize the model
model = PVAModel(config)

# Train the encoder
history = model.fit_encoder(
    X_train, y_train,
    X_val, y_val,
    epochs=50,
    batch_size=64
)

# Train the XGBoost heads
metrics = model.fit_xgboost(X_train, y_train, X_val, y_val)

# Predict
predictions = model.predict(X_new)
for pred in predictions:
    print(f"Direction: {pred.direction}, Magnitude: {pred.magnitude}")
```

### PVAPrediction Class

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class PVAPrediction:
    direction: float       # -1 to 1 (bearish to bullish)
    magnitude: float       # Expected absolute move
    confidence: float      # 0 to 1
    encoder_features: np.ndarray  # Latent representation

    @property
    def expected_return(self) -> float:
        return self.direction * self.magnitude

    @property
    def signal_strength(self) -> float:
        return abs(self.direction) * self.confidence
```
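`signal_strength` lends itself to a simple confidence gate before acting on a prediction; a hypothetical filter (the threshold and names are illustrative, not part of the spec):

```python
def to_signal(direction: float, confidence: float, min_strength: float = 0.3) -> str:
    """Map a PVA prediction to a discrete trading signal, gating on signal strength."""
    strength = abs(direction) * confidence   # same definition as signal_strength
    if strength < min_strength:
        return 'flat'                        # too weak or too uncertain to trade
    return 'long' if direction > 0 else 'short'
```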

### PVATrainer Class

```python
from models.strategies.pva import PVATrainer, PVATrainerConfig

# Configure the trainer
config = PVATrainerConfig(
    timeframe='5m',
    sequence_length=100,
    target_horizon=12,
    encoder_epochs=50
)

trainer = PVATrainer(config)

# Train for a single symbol
model, metrics = trainer.train(
    symbol='XAUUSD',
    start_date='2023-01-01',
    end_date='2024-12-31'
)

# Walk-forward validation
results = trainer.walk_forward_train('XAUUSD', n_folds=5)
print(f"Avg Direction Accuracy: {results.avg_direction_accuracy:.2%}")

# Save the model
trainer.save_model(model, 'XAUUSD', 'v1.0.0')
```

## File Structure

```
apps/ml-engine/src/models/strategies/pva/
├── __init__.py
├── model.py               # PVAModel, PVAConfig, PVAPrediction
├── feature_engineering.py # PVAFeatureEngineer, PVAFeatureConfig
├── trainer.py             # PVATrainer, TrainingMetrics
└── attention.py           # PriceVariationAttention encoder
```

## Production Considerations

### GPU Acceleration

```python
import torch

# Automatic GPU detection
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = PVAModel(config, device=device)

# XGBoost on GPU (XGBoost >= 2.0: device='cuda' replaces the
# deprecated 'gpu_hist' tree method)
xgb_params = {
    'tree_method': 'hist',
    'device': 'cuda'
}
```

### Model Versioning

```
models/pva/{symbol}/{version}/
├── encoder.pt           # PyTorch encoder weights
├── xgb_direction.joblib # XGBoost direction classifier
├── xgb_magnitude.joblib # XGBoost magnitude regressor
├── config.json          # Model configuration
├── metadata.json        # Training metadata
└── feature_names.json   # Feature column names
```
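A small helper can materialize this layout; a standard-library sketch (the helper names are assumptions, and only `config.json` is shown):

```python
import json
from pathlib import Path

def model_dir(root: str, symbol: str, version: str) -> Path:
    """models/pva/{symbol}/{version}/ under the given root."""
    return Path(root) / 'models' / 'pva' / symbol / version

def save_config(root: str, symbol: str, version: str, config: dict) -> Path:
    """Create the versioned directory and write the model configuration."""
    target = model_dir(root, symbol, version)
    target.mkdir(parents=True, exist_ok=True)
    (target / 'config.json').write_text(json.dumps(config, indent=2))
    return target
```

Keeping symbol and version in the path lets inference pin an exact artifact set per symbol.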

### Inference Batch Size

| Scenario | Recommended batch size |
|---|---|
| Real-time (single prediction) | 1 |
| Backtesting | 256 |
| Bulk inference | 1024 |
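For the backtesting and bulk scenarios, inference input is typically chunked to the recommended size; a trivial batching helper (illustrative):

```python
def iter_batches(sequences, batch_size: int):
    """Yield consecutive slices of at most `batch_size` items."""
    for start in range(0, len(sequences), batch_size):
        yield sequences[start:start + batch_size]
```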


Author: ML-Specialist (NEXUS v4.0)
Date: 2026-01-25