# Implementation Plan - ML Hierarchical Architecture

**Version:** 1.0.0
**Date:** 2026-01-07
**Status:** DRAFT - Pending Validation

---

## Executive Summary

This plan details the implementation phases for:

1. Walk-forward optimization (validate on more OOS periods)
2. Training more assets (BTCUSD, GBPUSD, USDJPY)
3. Neural Gating training (complete the data integration)
4. Production (integrate with FastAPI)

---

## Current State

### Existing Models

| Component | XAUUSD | EURUSD | BTCUSD | GBPUSD | USDJPY |
|-----------|--------|--------|--------|--------|--------|
| Attention 5m | OK | OK | NO | NO | NO |
| Attention 15m | OK | OK | NO | NO | NO |
| Base Model High 5m | OK | OK | NO | NO | NO |
| Base Model Low 5m | OK | OK | NO | NO | NO |
| Base Model High 15m | OK | OK | NO | NO | NO |
| Base Model Low 15m | OK | OK | NO | NO | NO |
| Metamodel XGBoost | OK | OK | NO | NO | NO |
| Neural Gating | PARTIAL | NO | NO | NO | NO |
| Backtest V2 | OK | OK | NO | NO | NO |

### Validated Results

| Asset | Strategy | Expectancy | Win Rate | PF |
|-------|----------|------------|----------|-----|
| XAUUSD | conservative | +0.0284 | 46.0% | 1.07 |
| EURUSD | conservative | +0.0780 | 48.2% | 1.23 |

---

## PHASE 1: Walk-Forward Optimization

### Objective

Validate the "conservative" strategy over multiple out-of-sample (OOS) periods to confirm robustness.
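To make the robustness check concrete, here is a minimal sketch of how the per-period metrics could be computed from a list of trade outcomes. It assumes trades are expressed in R multiples (profit or loss divided by initial risk) and uses the standard definitions of expectancy (mean R per trade), win rate, and profit factor (gross profit over gross loss). The helper names `period_metrics` and `is_robust` and the sample trades are hypothetical; they are not taken from `scripts/evaluate_hierarchical_v2.py`.

```python
from statistics import mean


def period_metrics(trades_r):
    """Compute expectancy, win rate and profit factor for one OOS period.

    trades_r: per-trade results in R multiples (assumed unit).
    """
    if not trades_r:
        raise ValueError("no trades in period")
    wins = [r for r in trades_r if r > 0]
    losses = [r for r in trades_r if r < 0]
    gross_profit = sum(wins)
    gross_loss = -sum(losses)
    return {
        "expectancy": mean(trades_r),
        "win_rate": len(wins) / len(trades_r),
        # With no losing trades the ratio is undefined; report infinity.
        "profit_factor": gross_profit / gross_loss if gross_loss > 0 else float("inf"),
    }


def is_robust(results_by_period):
    """True only if expectancy is positive in every period."""
    return all(m["expectancy"] > 0 for m in results_by_period.values())


# Hypothetical sample trades, for illustration only.
sample = {
    "Q3-2024": [1.0, -1.0, 2.0, -1.0],
    "Q4-2024": [-1.0, -1.0, 1.5, 1.0],
}
results = {name: period_metrics(trades) for name, trades in sample.items()}
robust = is_robust(results)
```

Requiring a positive expectancy in every period, rather than on the pooled trades, keeps one strong quarter from masking a losing one.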
### Files Involved

| File | Action | Purpose |
|------|--------|---------|
| `scripts/evaluate_hierarchical_v2.py` | RUN | Backtest with walk-forward |
| `src/backtesting/engine.py` | READ | Understand the current mechanism |
| `config/validation_oos.yaml` | MODIFY | Define walk-forward periods |

### Validation Periods

```yaml
walk_forward_periods:
  - name: "Q3-2024"
    start: "2024-07-01"
    end: "2024-09-30"
  - name: "Q4-2024"
    start: "2024-10-01"
    end: "2024-12-31"
  - name: "Full-2024-H2"
    start: "2024-07-01"
    end: "2024-12-31"
```

### Dependencies

- Attention models trained (OK)
- Base models trained (OK)
- Metamodels trained (OK)
- Historical data in the DB (VERIFY)

### Success Metrics

- Expectancy > 0 in ALL periods
- Win Rate > 40%
- Profit Factor > 1.0
- Max Drawdown < 50R

### Execution Commands

```bash
cd /home/isem/workspace-v1/projects/trading-platform/apps/ml-engine

# XAUUSD Walk-Forward
python scripts/evaluate_hierarchical_v2.py \
    --symbol XAUUSD \
    --strategy conservative \
    --start-date 2024-07-01 \
    --end-date 2024-12-31 \
    --output models/backtest_results_v2/walk_forward/

# EURUSD Walk-Forward
python scripts/evaluate_hierarchical_v2.py \
    --symbol EURUSD \
    --strategy conservative \
    --start-date 2024-07-01 \
    --end-date 2024-12-31 \
    --output models/backtest_results_v2/walk_forward/
```

---

## PHASE 2: Train New Assets

### Objective

Extend the hierarchical architecture to BTCUSD, GBPUSD, and USDJPY.

### Prerequisites (VERIFY)

1. **Data in the MySQL DB:**

   ```sql
   SELECT ticker, COUNT(*), MIN(date_agg), MAX(date_agg)
   FROM tickers_agg_data
   WHERE ticker IN ('X:BTCUSD', 'C:GBPUSD', 'C:USDJPY')
   GROUP BY ticker;
   ```

2. **Minimum required:**
   - 5 years of data (2019-2024)
   - At least 500,000 records per asset

### Sub-phases per Asset

#### 2.1 BTCUSD

```yaml
ticker_format: "X:BTCUSD"  # Crypto uses the X: prefix
caracteristicas:
  - High volatility (ATR ~$500-2000)
  - 24/7 trading
  - Low correlation with forex
```

**Commands:**

```bash
# Step 1: Train Attention models
python scripts/train_attention_model.py \
    --symbols BTCUSD \
    --timeframes 5m,15m \
    --output models/attention/

# Step 2: Train Base models
python scripts/train_symbol_timeframe_models.py \
    --symbols BTCUSD \
    --timeframes 5m,15m \
    --output models/symbol_timeframe_models/

# Step 3: Train Metamodel
python scripts/train_metamodels.py \
    --symbols BTCUSD \
    --output models/metamodels/

# Step 4: Backtest
python scripts/evaluate_hierarchical_v2.py \
    --symbol BTCUSD \
    --strategy conservative
```

#### 2.2 GBPUSD

```yaml
ticker_format: "C:GBPUSD"
caracteristicas:
  - Medium volatility (~60-100 pips per day)
  - London and NY sessions
  - High correlation with EURUSD
```

**Commands:**

```bash
python scripts/train_attention_model.py --symbols GBPUSD
python scripts/train_symbol_timeframe_models.py --symbols GBPUSD
python scripts/train_metamodels.py --symbols GBPUSD
python scripts/evaluate_hierarchical_v2.py --symbol GBPUSD
```

#### 2.3 USDJPY

```yaml
ticker_format: "C:USDJPY"
caracteristicas:
  - Low-to-medium volatility (~50-80 pips per day)
  - Active in the Asia and NY sessions
  - Inverse correlation with XAUUSD
```

**Commands:**

```bash
python scripts/train_attention_model.py --symbols USDJPY
python scripts/train_symbol_timeframe_models.py --symbols USDJPY
python scripts/train_metamodels.py --symbols USDJPY
python scripts/evaluate_hierarchical_v2.py --symbol USDJPY
```

### Files to Modify

| File | Change | Reason |
|------|--------|--------|
| `config/models.yaml` | Add per-asset hyperparameters | Different volatility |
| `src/config/feature_flags.py` | Add SYMBOL_CONFIGS | New assets |
| `src/data/features.py` | Verify ATR scale | BTC vs Forex |

---

## PHASE 3: Neural Gating Training

### Objective

Complete Neural Gating training for ALL assets.

### Current State

| Asset | Neural Gating | XGBoost Comparison |
|-------|---------------|--------------------|
| XAUUSD | Training Data Cached | Pending |
| EURUSD | NO | NO |
| BTCUSD | NO | NO |
| GBPUSD | NO | NO |
| USDJPY | NO | NO |

### Key Files

| File | Lines | Purpose |
|------|-------|---------|
| `src/models/neural_gating_metamodel.py` | 853 | NN architecture |
| `scripts/train_neural_gating.py` | 286 | Full script |
| `scripts/train_neural_gating_simple.py` | 313 | Simplified script |

### Neural Gating Architecture

```
Input Features (10):
├── pred_high_5m
├── pred_low_5m
├── pred_high_15m
├── pred_low_15m
├── attention_5m
├── attention_15m
├── attention_class_5m
├── attention_class_15m
├── ATR_ratio
└── volume_z

Networks:
├── GatingNetwork: [32, 16] -> alpha_high, alpha_low (0-1)
├── ResidualNetwork: [64, 32] -> residual_high, residual_low
└── ConfidenceNetwork: [32, 16] -> confidence (0-1)

Formula:
delta_final = alpha * pred_5m + (1-alpha) * pred_15m + residual + softplus()
```

### Execution Commands

```bash
# Option 1: Full script (uses the existing training_data.joblib)
python scripts/train_neural_gating.py \
    --symbols XAUUSD,EURUSD \
    --epochs 100 \
    --compare

# Option 2: Simple script (generates data from HierarchicalPipeline)
python scripts/train_neural_gating_simple.py \
    --symbol XAUUSD \
    --epochs 50 \
    --compare

# For ALL assets (post-Phase 2)
python scripts/train_neural_gating.py \
    --symbols XAUUSD,EURUSD,BTCUSD,GBPUSD,USDJPY \
    --output-dir models/metamodels_neural \
    --epochs 100 \
    --compare
```

### Comparison Metrics

| Metric | Neural | XGBoost | Winner |
|--------|--------|---------|--------|
| MAE (avg) | TBD | TBD | TBD |
| R2 (avg) | TBD | TBD | TBD |
| Confidence Accuracy | TBD | TBD | TBD |
| Improvement over avg | TBD | TBD | TBD |

---

## PHASE 4: FastAPI Integration

### Objective

Connect the ML models to the production API.

### Existing Endpoints (Port 3083)

| Endpoint | Method | Purpose |
|----------|--------|---------|
| `/health` | GET | Health check |
| `/predict/range` | POST | Delta high/low prediction |
| `/generate/signal` | POST | Generates a complete signal |
| `/api/ensemble/{symbol}` | POST | Ensemble signal |
| `/ws/signals` | WS | Real-time signals |

### Files to Verify/Modify

| File | Action | Reason |
|------|--------|--------|
| `src/api/main.py` | VERIFY | Existing endpoints |
| `src/services/prediction_service.py` | MODIFY | Integrate Neural Gating |
| `src/services/hierarchical_predictor.py` | VERIFY | Prediction pipeline |

### Integration Flow

```
1. Request: POST /predict/range {symbol: "XAUUSD", timeframe: "15m"}
   │
   ▼
2. HierarchicalPredictor.predict()
   │
   ├── Load Attention Model -> attention_score, attention_class
   │
   ├── Load Base Models -> pred_high_5m, pred_low_5m, pred_high_15m, pred_low_15m
   │
   ├── Compute Context -> ATR_ratio, volume_z
   │
   ├── DECISION: XGBoost OR Neural Gating?
   │   │
   │   ├── XGBoost: AssetMetamodel.predict()
   │   │
   │   └── Neural: NeuralGatingMetamodel.predict()
   │
   ▼
3. Response: {delta_high, delta_low, confidence, ...}
```

### Model Selection Configuration

```yaml
# config/models.yaml
metamodel:
  default: "xgboost"  # or "neural_gating"
  per_symbol:
    XAUUSD: "neural_gating"  # If Neural outperforms XGBoost
    EURUSD: "xgboost"
    BTCUSD: "xgboost"
```

### Verification Commands

```bash
# Start the API
cd /home/isem/workspace-v1/projects/trading-platform/apps/ml-engine
uvicorn src.api.main:app --host 0.0.0.0 --port 3083 --reload

# Test endpoint
curl -X POST http://localhost:3083/predict/range \
    -H "Content-Type: application/json" \
    -d '{"symbol": "XAUUSD", "timeframe": "15m"}'

# Health check
curl http://localhost:3083/health
```

---

## PHASE 5: Final Validation

### Validation Checklist

#### 5.1 Walk-Forward

- [ ] XAUUSD Q3-2024: Expectancy > 0
- [ ] XAUUSD Q4-2024: Expectancy > 0
- [ ] EURUSD Q3-2024: Expectancy > 0
- [ ] EURUSD Q4-2024: Expectancy > 0

#### 5.2 New Assets

- [ ] BTCUSD: All models trained
- [ ] BTCUSD: Backtest with expectancy > 0
- [ ] GBPUSD: All models trained
- [ ] GBPUSD: Backtest with expectancy > 0
- [ ] USDJPY: All models trained
- [ ] USDJPY: Backtest with expectancy > 0

#### 5.3 Neural Gating

- [ ] XAUUSD: Neural vs XGBoost compared
- [ ] EURUSD: Neural vs XGBoost compared
- [ ] Winner selected per asset

#### 5.4 API

- [ ] /predict/range works for all assets
- [ ] /generate/signal works
- [ ] WebSocket /ws/signals works
- [ ] Express backend can communicate with the API

---

## Dependencies Between Phases

```
PHASE 1 (Walk-Forward)
    │
    └── No prior dependencies

PHASE 2 (New Assets)
    │
    └── Depends on: data available in the MySQL DB

PHASE 3 (Neural Gating)
    │
    ├── XAUUSD/EURUSD: No dependencies (existing models)
    └── BTCUSD/GBPUSD/USDJPY: Depends on PHASE 2

PHASE 4 (FastAPI)
    │
    ├── Depends on: PHASE 3 (to know which model to use)
    └── Can run in parallel with PHASES 1-2

PHASE 5 (Validation)
    │
    └── Depends on: all previous phases
```

---

## Resource Estimates
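The totals in this section are per-asset figures scaled to five assets. The sketch below reproduces that arithmetic for the training-time table, under the assumption that all jobs run sequentially on a single GPU; the variable names are illustrative only.

```python
# Per-asset estimates in minutes, copied from the training-time table
# in this section.
PER_ASSET_MINUTES = {
    "Attention": 10,
    "Base Models": 30,
    "Metamodel XGBoost": 5,
    "Neural Gating": 15,
    "Backtest": 30,
}
N_ASSETS = 5  # XAUUSD, EURUSD, BTCUSD, GBPUSD, USDJPY

# Assumption: jobs run one after another, so times simply add up.
per_asset_total = sum(PER_ASSET_MINUTES.values())
all_assets_total = per_asset_total * N_ASSETS

print(f"Per asset: {per_asset_total / 60:.1f} h")
print(f"All {N_ASSETS} assets: {all_assets_total / 60:.1f} h")
```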
### Training Time (GPU RTX 5060 Ti)

| Model | Per Asset | Total (5 assets) |
|-------|-----------|------------------|
| Attention | ~10 min | ~50 min |
| Base Models | ~30 min | ~2.5 hrs |
| Metamodel XGBoost | ~5 min | ~25 min |
| Neural Gating | ~15 min | ~1.25 hrs |
| Backtest | ~30 min | ~2.5 hrs |
| **TOTAL** | ~1.5 hrs | ~7-8 hrs |

### Disk Space

| Component | Per Asset | Total |
|-----------|-----------|-------|
| Attention | ~2 MB | ~10 MB |
| Base Models | ~1.5 MB | ~7.5 MB |
| Metamodel | ~300 KB | ~1.5 MB |
| Neural Gating | ~50 KB | ~250 KB |
| Backtest Results | ~100 KB | ~500 KB |
| **TOTAL** | ~4 MB | ~20 MB |

---

## Immediate Next Steps

1. **TODAY:** Validate data in the DB for BTCUSD, GBPUSD, USDJPY
2. **RUN:** Walk-forward for XAUUSD and EURUSD
3. **DECIDE:** Priority order of the new assets
4. **TRAIN:** First new asset (suggested: GBPUSD, because of its correlation with EURUSD)

---

## PLAN VALIDATION vs ANALYSIS

### Original Requirements vs Coverage

| Requirement | Covered | Phase | Detail |
|-------------|---------|-------|--------|
| Walk-forward optimization | YES | PHASE 1 | Periods Q3-2024, Q4-2024 defined |
| Train BTCUSD | YES | PHASE 2.1 | Ticker: X:BTCUSD |
| Train GBPUSD | YES | PHASE 2.2 | Ticker: C:GBPUSD |
| Train USDJPY | YES | PHASE 2.3 | Ticker: C:USDJPY |
| Neural Gating Training | YES | PHASE 3 | Scripts identified |
| FastAPI integration | YES | PHASE 4 | Endpoints documented |

### Critical Files Identified

```
TIER 1 (Base - No dependencies):
├── config/database.yaml [EXISTS]
└── src/data/database.py [EXISTS]

TIER 2 (Independent Training):
├── scripts/train_attention_model.py [EXISTS]
│   └── Generates: models/attention/{symbol}_{tf}_attention/
├── scripts/train_symbol_timeframe_models.py [EXISTS]
│   └── Generates: models/symbol_timeframe_models/{symbol}_{tf}_{target}_h3.joblib

TIER 3 (Requires TIER 2):
├── scripts/train_metamodels.py [EXISTS]
│   ├── Loads: models/attention/* [REQUIRED]
│   ├── Loads: models/symbol_timeframe_models/* [REQUIRED]
│   └── Generates: models/metamodels/{symbol}/

TIER 4 (Evaluation - Requires TIER 3):
├── scripts/evaluate_hierarchical_v2.py [EXISTS]
│   ├── Loads: models/attention/* [REQUIRED]
│   ├── Loads: models/symbol_timeframe_models/* [REQUIRED]
│   ├── Loads: models/metamodels/* [REQUIRED]
│   └── Generates: models/backtest_results_v2/

TIER 5 (Neural Gating - Parallel to TIER 3):
├── scripts/train_neural_gating.py [EXISTS]
│   ├── Loads: models/metamodels/trainer_metadata.joblib [REQUIRED]
│   └── Generates: models/metamodels_neural/{symbol}/
├── scripts/train_neural_gating_simple.py [EXISTS]
│   ├── Loads: HierarchicalPipeline (TIER 2-3 models)
│   └── Generates: models/metamodels_neural/{symbol}/
```

### Execution Order per New Asset

```
To train BTCUSD/GBPUSD/USDJPY:

1. Verify data in the DB:
   SELECT ticker, COUNT(*), MIN(date_agg), MAX(date_agg)
   FROM tickers_agg_data
   WHERE ticker IN ('X:BTCUSD', 'C:GBPUSD', 'C:USDJPY')
   GROUP BY ticker;

2. Train Attention (independent):
   python scripts/train_attention_model.py --symbols {SYMBOL}

3. Train Base Models (independent):
   python scripts/train_symbol_timeframe_models.py --symbols {SYMBOL}

4. Train Metamodel (requires 2 and 3):
   python scripts/train_metamodels.py --symbols {SYMBOL}

5. Evaluate (requires 4):
   python scripts/evaluate_hierarchical_v2.py --symbol {SYMBOL}

6. (Optional) Neural Gating (requires 4):
   python scripts/train_neural_gating.py --symbols {SYMBOL}
```

### Data Required in the MySQL DB

| Table | Field | Type | Required |
|-------|-------|------|----------|
| tickers_agg_data | date_agg | DATETIME | YES |
| tickers_agg_data | ticker | VARCHAR | YES |
| tickers_agg_data | open | DECIMAL | YES |
| tickers_agg_data | high | DECIMAL | YES |
| tickers_agg_data | low | DECIMAL | YES |
| tickers_agg_data | close | DECIMAL | YES |
| tickers_agg_data | volume | BIGINT | YES |
| tickers_agg_data | vwap | DECIMAL | Optional |

**Minimums per Asset:**

- Records: 50,000+ (5m data)
- Period: 2019-01-01 to 2024-12-31 (5 years)

### Identified Risks

| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| Insufficient data for BTCUSD/GBPUSD/USDJPY | MEDIUM | HIGH | Verify the DB before training |
| Neural Gating does not outperform XGBoost | LOW | LOW | Keep XGBoost as fallback |
| Walk-forward reveals overfitting | MEDIUM | HIGH | Adjust strategy filters |
| API does not support the new assets | LOW | MEDIUM | Verify existing endpoints |

---

## TRACEABILITY MATRIX

### Modified Files vs Phases

| File | PHASE 1 | PHASE 2 | PHASE 3 | PHASE 4 |
|------|---------|---------|---------|---------|
| `config/validation_oos.yaml` | MODIFY | - | - | - |
| `config/models.yaml` | - | MODIFY | - | MODIFY |
| `scripts/train_attention_model.py` | - | RUN | - | - |
| `scripts/train_symbol_timeframe_models.py` | - | RUN | - | - |
| `scripts/train_metamodels.py` | - | RUN | - | - |
| `scripts/train_neural_gating.py` | - | - | RUN | - |
| `scripts/evaluate_hierarchical_v2.py` | RUN | RUN | - | - |
| `src/services/prediction_service.py` | - | - | - | MODIFY |
| `src/api/main.py` | - | - | - | VERIFY |

### Generated Models vs Phases

| Model | PHASE 1 | PHASE 2 | PHASE 3 | PHASE 4 |
|-------|---------|---------|---------|---------|
| `models/attention/BTCUSD_*` | - | GENERATE | - | - |
| `models/attention/GBPUSD_*` | - | GENERATE | - | - |
| `models/attention/USDJPY_*` | - | GENERATE | - | - |
| `models/symbol_timeframe_models/BTCUSD_*` | - | GENERATE | - | - |
| `models/symbol_timeframe_models/GBPUSD_*` | - | GENERATE | - | - |
| `models/symbol_timeframe_models/USDJPY_*` | - | GENERATE | - | - |
| `models/metamodels/BTCUSD/` | - | GENERATE | - | - |
| `models/metamodels/GBPUSD/` | - | GENERATE | - | - |
| `models/metamodels/USDJPY/` | - | GENERATE | - | - |
| `models/metamodels_neural/XAUUSD/` | - | - | GENERATE | - |
| `models/metamodels_neural/EURUSD/` | - | - | GENERATE | - |
| `models/backtest_results_v2/walk_forward/` | GENERATE | - | - | - |

---

**Automatically generated document**
**ML-Specialist-Agent | Trading Platform**
**Version:** 1.1.0 | **Date:** 2026-01-07