trading-platform/docs/99-analisis/ET-ML-FACTORES-ATENCION-SPEC.md

---
title: "Technical Specification: Attention System with Dynamic Factors"
version: "1.0.0"
date: "2026-01-06"
status: "Draft"
author: "ML-Specialist + Orquestador"
epic: "OQI-006"
tags: ["ml", "attention", "factors", "dynamic", "specification"]
priority: "HIGH"
---

# ET-ML-FACTORES-ATENCION: Attention System with Dynamic Factors

## 1. SUMMARY

This document specifies an attention system based on dynamic factors computed from a rolling median of the candle range (an ATR-style volatility measure). It replaces the current hardcoded factors and allows the system to scale to 100+ assets.

---

## 2. CURRENT PROBLEM

### 2.1 Hardcoded Factors

**Current location**: `range_predictor_factor.py:598-601`

```python
# PROBLEM: only 2 assets, static values
SYMBOLS = {
    'XAUUSD': {'base': 2650.0, 'volatility': 0.0012, 'factor': 2.5},
    'EURUSD': {'base': 1.0420, 'volatility': 0.0004, 'factor': 0.0003},
}
```

### 2.2 Impacts

| Impact | Description | Severity |
|--------|-------------|----------|
| Scalability | Does not scale to 100+ assets | CRITICAL |
| Adaptability | Does not adapt to changes in volatility | HIGH |
| Maintenance | Every new asset requires a code change | MEDIUM |
| Accuracy | Stale factors degrade predictions | HIGH |

---

## 3. PROPOSED SOLUTION

### 3.1 Dynamic Factor Calculation

```python
import pandas as pd


def compute_factor_median_range(
    df: pd.DataFrame,
    window: int = 200,
    min_periods: int = 100
) -> pd.Series:
    """
    Dynamic factor = rolling median of the candle range, shifted by one bar.

    The shift(1) prevents data leakage: the factor at bar t only uses
    bars up to t-1.

    Args:
        df: DataFrame with 'High' and 'Low' columns
        window: Rolling window (default: 200 candles)
        min_periods: Minimum number of periods required to compute a value

    Returns:
        Series with the dynamic factor per timestamp (NaN during warm-up)
    """
    range_col = df['High'] - df['Low']
    factor = range_col.rolling(window=window, min_periods=min_periods).median().shift(1)
    return factor
```
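
As a quick sanity check of the leakage claim, the sketch below recomputes the factor after inflating the last candle and confirms that the factor at that timestamp does not move. The synthetic candles and the small window are assumptions chosen only for illustration; `compute_factor_median_range` is repeated so the snippet runs on its own.

```python
import numpy as np
import pandas as pd


def compute_factor_median_range(df, window=200, min_periods=100):
    """Rolling median of High - Low, shifted one bar (same logic as above)."""
    rng = df["High"] - df["Low"]
    return rng.rolling(window=window, min_periods=min_periods).median().shift(1)


# Synthetic candles (illustrative values, not market data)
rs = np.random.default_rng(0)
low = 2650.0 + rs.normal(0.0, 1.0, size=50)
high = low + rs.uniform(1.0, 9.0, size=50)
df = pd.DataFrame({"High": high, "Low": low})

factor = compute_factor_median_range(df, window=20, min_periods=10)

# Inflate the range of the last candle and recompute
df_spike = df.copy()
df_spike.loc[df_spike.index[-1], "High"] += 100.0
factor_spike = compute_factor_median_range(df_spike, window=20, min_periods=10)

# The factor at the last timestamp depends only on earlier candles
unchanged = bool(factor.iloc[-1] == factor_spike.iloc[-1])
# Warm-up: the first min_periods rows have no factor yet
warmup_nans = int(factor.isna().sum())
print(unchanged, warmup_nans)
```

The warm-up rows carry `NaN`, so any consumer of the factor needs a fallback (for example an initial value per symbol) or must drop those rows.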

### 3.2 Attention Weight Mapping

**Smooth function (softplus)**:
```python
import numpy as np


def weight_smooth(m: np.ndarray, w_max: float = 3.0, beta: float = 4.0) -> np.ndarray:
    """
    Smooth mapping from move multiplier to attention weight.

    Formula: w = log1p(exp(beta * (m - 1))) / beta, clipped to [0, w_max]

    Interpretation (beta = 4):
    - m << 1 → w ≈ 0 (noise, ignore)
    - m = 1 → w = ln(2)/beta ≈ 0.17 (typical move)
    - m = 2 → w ≈ 1 (2x normal, medium attention)
    - m = 3 → w ≈ 2 (3x normal, high attention)
    """
    x = beta * (np.asarray(m, dtype=float) - 1.0)
    # logaddexp(0, x) = log(1 + exp(x)), evaluated without overflow for large x
    w = np.logaddexp(0.0, x) / beta
    return np.clip(w, 0.0, w_max)
```

### 3.3 Example for XAUUSD

| Actual Move | Dynamic Factor | Multiplier | Attention Weight |
|-------------|----------------|------------|------------------|
| 3.5 USD | 5.0 USD | 0.70 | **~0.07** (noise) |
| 5.0 USD | 5.0 USD | 1.00 | **~0.17** (normal) |
| 7.5 USD | 5.0 USD | 1.50 | **~0.53** (interest) |
| 10.0 USD | 5.0 USD | 2.00 | **~1.0** (attention) |
| 15.0 USD | 5.0 USD | 3.00 | **~2.0** (high attention) |
| 20.0 USD | 5.0 USD | 4.00 | **3.0** (maximum) |
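
The weights for these multipliers can be evaluated directly from the softplus formula in section 3.2. The sketch below does so with `beta = 4` and `w_max = 3`; the overflow-safe `np.logaddexp` form is a choice made here and is mathematically the same function. Note that softplus never reaches exactly zero: at `m = 1` it gives `ln(2)/beta ≈ 0.17`.

```python
import numpy as np


def weight_smooth(m, w_max=3.0, beta=4.0):
    """Softplus mapping w = log(1 + exp(beta * (m - 1))) / beta, clipped to [0, w_max]."""
    x = beta * (np.asarray(m, dtype=float) - 1.0)
    return np.clip(np.logaddexp(0.0, x) / beta, 0.0, w_max)


factor = 5.0  # dynamic factor in USD, as in the table
moves = np.array([3.5, 5.0, 7.5, 10.0, 15.0, 20.0])  # actual moves in USD

multipliers = moves / factor
weights = weight_smooth(multipliers)

for move, m, w in zip(moves, multipliers, weights):
    print(f"{move:5.1f} USD  m={m:.2f}  w={w:.2f}")
```

If weights of exactly zero are required for `m <= 1`, a hard threshold or a shifted softplus would be needed; that is a design decision outside this sketch.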

---

## 4. IMPLEMENTATION ARCHITECTURE

### 4.1 Proposed File Structure

```
ml-engine/src/
├── config/
│   ├── symbols_config.yaml     # Symbol configuration (NEW)
│   └── attention_config.yaml   # Attention configuration (NEW)
├── models/
│   ├── base/
│   │   └── attention_weighted_model.py  # Base class (NEW)
│   ├── trained/
│   │   ├── XAUUSD/
│   │   │   ├── 5m/
│   │   │   │   ├── range_predictor.joblib
│   │   │   │   ├── movement_predictor.joblib
│   │   │   │   └── config.yaml
│   │   │   └── 15m/
│   │   │       └── ...
│   │   ├── EURUSD/
│   │   │   └── ...
│   │   └── BTCUSDT/
│   │       └── ...
│   └── (existing models)
└── training/
    └── dynamic_factor_calculator.py  # (NEW)
```

### 4.2 symbols_config.yaml

```yaml
# Centralized symbol configuration
# Initial factors for warm start (updated automatically)

symbols:
  XAUUSD:
    category: "commodity"
    decimal_places: 2
    pip_size: 0.01
    initial_factor: 5.0  # Warm-up only
    factor_window: 200
    min_periods: 100

  EURUSD:
    category: "forex"
    decimal_places: 5
    pip_size: 0.0001
    initial_factor: 0.0003
    factor_window: 200
    min_periods: 100

  BTCUSDT:
    category: "crypto"
    decimal_places: 2
    pip_size: 0.01
    initial_factor: 200.0
    factor_window: 200
    min_periods: 100

  # ... more symbols
```
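
Since every downstream component reads this file, it may be worth validating it at load time. The following is a minimal sketch, assuming the YAML has already been parsed into a dictionary (for example with `yaml.safe_load`); the `SymbolConfig` class, `parse_symbols` helper, and the specific validation rules are hypothetical and not part of the existing codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SymbolConfig:
    """One entry of the `symbols` mapping (field names follow the YAML above)."""
    category: str
    decimal_places: int
    pip_size: float
    initial_factor: float
    factor_window: int = 200
    min_periods: int = 100

    def __post_init__(self):
        # Assumed sanity rules; adjust to the project's real constraints
        if self.initial_factor <= 0:
            raise ValueError("initial_factor must be positive")
        if not 0 < self.min_periods <= self.factor_window:
            raise ValueError("min_periods must be in (0, factor_window]")


def parse_symbols(raw: dict) -> dict:
    """Builds validated SymbolConfig objects from the parsed YAML mapping."""
    return {name: SymbolConfig(**fields) for name, fields in raw["symbols"].items()}


# Stand-in for the parsed file, limited to two symbols
raw = {
    "symbols": {
        "XAUUSD": {"category": "commodity", "decimal_places": 2, "pip_size": 0.01,
                   "initial_factor": 5.0, "factor_window": 200, "min_periods": 100},
        "EURUSD": {"category": "forex", "decimal_places": 5, "pip_size": 0.0001,
                   "initial_factor": 0.0003, "factor_window": 200, "min_periods": 100},
    }
}

symbols = parse_symbols(raw)
print(symbols["EURUSD"].initial_factor)
```

An unknown key in the YAML raises a `TypeError` from the dataclass constructor, which surfaces typos in the config early.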

### 4.3 DynamicFactorCalculator Class

```python
from typing import Dict, List, Optional

import pandas as pd
import yaml  # PyYAML


class DynamicFactorCalculator:
    """
    Computes and maintains dynamic factors for all symbols.

    Features:
    - Rolling median with shift(1) to avoid leakage
    - Per-symbol factor cache
    - Incremental update (EMA)
    - Optional persistence in Redis/file
    """

    def __init__(self, config_path: str = "config/symbols_config.yaml"):
        self.config = self._load_config(config_path)
        self._factors: Dict[str, float] = {}  # Cache
        self._factor_history: Dict[str, List[float]] = {}

    @staticmethod
    def _load_config(config_path: str) -> dict:
        """Loads the symbol configuration from YAML."""
        with open(config_path, "r", encoding="utf-8") as fh:
            return yaml.safe_load(fh)

    def get_factor(self, symbol: str, df: Optional[pd.DataFrame] = None) -> float:
        """Returns the current factor for a symbol."""
        if df is not None:
            return self._compute_factor(symbol, df)
        if symbol in self._factors:
            return self._factors[symbol]
        # Fall back to the warm-up value; raises KeyError for unconfigured symbols
        return self.config['symbols'][symbol]['initial_factor']

    def _compute_factor(self, symbol: str, df: pd.DataFrame) -> float:
        """Computes the dynamic factor from the rolling median."""
        cfg = self.config['symbols'].get(symbol, {})
        window = cfg.get('factor_window', 200)
        min_periods = cfg.get('min_periods', 100)

        range_col = df['High'] - df['Low']
        # shift(1) excludes the latest candle, consistent with section 3.1.
        # The result is NaN when fewer than min_periods candles are available.
        rolling_median = range_col.rolling(window=window, min_periods=min_periods).median()
        factor = float(rolling_median.shift(1).iloc[-1])

        # Cache
        self._factors[symbol] = factor

        return factor

    def update_factor_incremental(self, symbol: str, new_range: float, alpha: float = 0.02):
        """Incremental update using an EMA.

        Note: an EMA tracks the mean range, so it approximates the rolling
        median rather than reproducing it.
        """
        current = self._factors.get(symbol)
        if current is None:
            self._factors[symbol] = new_range
        else:
            self._factors[symbol] = alpha * new_range + (1 - alpha) * current
```
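
To get a feel for how quickly the incremental update reacts, the sketch below applies the same EMA formula to a step change in the typical range and compares the result with the analytical half-life `ln(0.5) / ln(1 - alpha)`. The regime values (5.0 jumping to 10.0) are illustrative assumptions, and the helper names are hypothetical.

```python
import math


def ema_update(current, new_range, alpha=0.02):
    """One EMA step, same formula as update_factor_incremental."""
    if current is None:
        return new_range
    return alpha * new_range + (1 - alpha) * current


def ema_half_life(alpha):
    """Number of updates after which an old value's weight has halved."""
    return math.log(0.5) / math.log(1.0 - alpha)


# Regime shift: the typical range jumps from 5.0 to 10.0 (illustrative values)
factor = 5.0
steps = 0
while factor < 7.5:  # halfway between the old and the new level
    factor = ema_update(factor, 10.0)
    steps += 1

half_life = ema_half_life(0.02)
print(steps, round(half_life, 1))
```

With `alpha = 0.02` the half-life is about 34 updates, which on 5m candles is roughly three hours. A larger `alpha` adapts faster but makes the factor noisier.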

---

## 5. INTEGRATION WITH EXISTING MODELS

### 5.1 AttentionWeightedModel (Base Class)

```python
from abc import ABC, abstractmethod

import numpy as np
import pandas as pd


class AttentionWeightedModel(ABC):
    """
    Base class for models with dynamic attention weights.

    Subclasses:
    - AttentionWeightedXGBoost
    - AttentionWeightedTransformer
    """

    def __init__(self, symbol: str, timeframe: str):
        self.symbol = symbol
        self.timeframe = timeframe
        self.factor_calculator = DynamicFactorCalculator()
        # VolatilityAttentionConfig is not defined in this document
        self.attention_config = VolatilityAttentionConfig()

    @abstractmethod
    def train(self, X, y, df_ohlcv: pd.DataFrame):
        """Trains the model with attention weights."""
        pass

    @abstractmethod
    def predict(self, X, df_ohlcv: pd.DataFrame):
        """Predicts using the current dynamic factor."""
        pass

    def compute_sample_weights(self, df: pd.DataFrame) -> np.ndarray:
        """Computes sample weights based on volatility."""
        # NOTE: get_factor returns a single (latest) value. Leakage-free training
        # weights would need the per-row series from compute_factor_median_range.
        factor = self.factor_calculator.get_factor(self.symbol, df)
        # compute_move_multiplier is not defined in this document
        multiplier = compute_move_multiplier(df, factor)
        weights = weight_smooth(multiplier, w_max=3.0, beta=4.0)

        # Normalize valid weights to mean=1
        valid_mask = ~np.isnan(weights)
        if valid_mask.any() and weights[valid_mask].mean() > 0:
            weights[valid_mask] /= weights[valid_mask].mean()

        return weights
```
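
`compute_move_multiplier` is referenced above but not specified. One possible reading, consistent with the multiplier column in section 3.3, is the candle range divided by the factor at the same bar. The sketch below assumes that definition and uses the per-row, shifted factor from section 3.1; both the helper body and the `sample_weights` wrapper are hypothetical.

```python
import numpy as np
import pandas as pd


def compute_move_multiplier(df: pd.DataFrame, factor: pd.Series) -> np.ndarray:
    """Hypothetical helper: candle range divided by the factor at the same bar."""
    return ((df["High"] - df["Low"]) / factor).to_numpy()


def weight_smooth(m, w_max=3.0, beta=4.0):
    """Softplus mapping from section 3.2 (overflow-safe form)."""
    x = beta * (np.asarray(m, dtype=float) - 1.0)
    return np.clip(np.logaddexp(0.0, x) / beta, 0.0, w_max)


def sample_weights(df, window=200, min_periods=100):
    """Per-row attention weights, normalized so the valid rows average 1."""
    rng = df["High"] - df["Low"]
    factor = rng.rolling(window=window, min_periods=min_periods).median().shift(1)
    weights = weight_smooth(compute_move_multiplier(df, factor))
    valid = ~np.isnan(weights)
    if valid.any() and weights[valid].mean() > 0:
        weights[valid] /= weights[valid].mean()
    return weights, valid


# Synthetic candles (illustrative values, not market data)
rs = np.random.default_rng(1)
low = 2650.0 + rs.normal(0.0, 1.0, size=300)
df = pd.DataFrame({"High": low + rs.uniform(1.0, 9.0, size=300), "Low": low})

weights, valid = sample_weights(df, window=50, min_periods=25)
# Warm-up rows have no factor yet; drop them (and the matching X, y rows) before fitting
train_weights = weights[valid]
print(len(train_weights), round(float(train_weights.mean()), 6))
```

Returning the validity mask alongside the weights makes it explicit that warm-up rows must be filtered out of the training set rather than passed to the model with `NaN` weights.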

### 5.2 RangePredictor Modification

```python
# In range_predictor.py - train()
import pandas as pd
from xgboost import XGBRegressor


class RangePredictor(AttentionWeightedModel):

    # Note: this signature takes two targets, unlike the abstract train(X, y, df_ohlcv)
    def train(self, X, y_high, y_low, df_ohlcv: pd.DataFrame):
        # Compute attention weights
        sample_weights = self.compute_sample_weights(df_ohlcv)

        # self.config is expected to hold the XGBoost hyperparameters
        # Train the HIGH model with weights
        self.model_high = XGBRegressor(**self.config)
        self.model_high.fit(X, y_high, sample_weight=sample_weights)

        # Train the LOW model with weights
        self.model_low = XGBRegressor(**self.config)
        self.model_low.fit(X, y_low, sample_weight=sample_weights)
```

---

## 6. SEPARATION BY ASSET AND TIMEFRAME

### 6.1 Trained Model Structure

```
trained/
├── XAUUSD/
│   ├── 5m/
│   │   ├── config.yaml          # Model-specific hyperparameters
│   │   ├── factor_stats.json    # Factor statistics
│   │   ├── range_predictor.joblib
│   │   ├── movement_predictor.joblib
│   │   └── attention_weights.npz
│   └── 15m/
│       └── ...
├── EURUSD/
│   ├── 5m/
│   └── 15m/
└── BTCUSDT/
    ├── 5m/
    └── 15m/
```
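
A small path helper keeps this layout in one place instead of scattering string concatenation across training and serving code. The sketch below is one way to do it with `pathlib`; the function names and the decision to upper-case symbols are assumptions, while the artifact names come from the tree above.

```python
from pathlib import Path

# Files expected in every trained/<SYMBOL>/<timeframe>/ directory
ARTIFACTS = ("config.yaml", "factor_stats.json", "range_predictor.joblib",
             "movement_predictor.joblib", "attention_weights.npz")


def model_dir(root: Path, symbol: str, timeframe: str) -> Path:
    """Directory holding the artifacts for one symbol/timeframe pair."""
    return root / symbol.upper() / timeframe


def missing_artifacts(root: Path, symbol: str, timeframe: str) -> list:
    """Names of expected files that are not present on disk."""
    base = model_dir(root, symbol, timeframe)
    return [name for name in ARTIFACTS if not (base / name).is_file()]


root = Path("trained")
print(model_dir(root, "xauusd", "5m").as_posix())
print(missing_artifacts(root, "XAUUSD", "5m"))
```

`missing_artifacts` could back the "asset coverage" check: a symbol counts as covered only when the list is empty for every configured timeframe.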

### 6.2 Per-Model config.yaml

```yaml
# trained/XAUUSD/5m/config.yaml
symbol: XAUUSD
timeframe: 5m
prediction_horizon: 15m  # 3 candles of 5m

factor:
  computed_at: "2026-01-06T10:00:00Z"
  value: 4.85
  window: 200
  method: "rolling_median_shift1"

attention:
  w_max: 3.0
  beta: 4.0
  use_smooth: true

xgboost:
  n_estimators: 300
  max_depth: 6
  learning_rate: 0.03

training:
  samples: 50000
  train_period: "2021-01-01 to 2025-12-31"
  validation_split: 0.2

metrics:
  mae_high: 2.15
  mae_low: 1.98
  r2_high: 0.72
  r2_low: 0.75
```

---

## 7. MODEL API

### 7.1 Unified Endpoint (Recommended)

```
GET /api/ml/predictions/{symbol}
    ?timeframe=15m
    &include_models=range,movement,amd,attention

Response:
{
  "symbol": "XAUUSD",
  "timeframe": "15m",
  "timestamp": "2026-01-06T10:30:00Z",
  "current_price": 2655.50,
  "dynamic_factor": 4.85,
  "attention_weight": 1.8,

  "models": {
    "range_predictor": {
      "pred_high": 2658.5,
      "pred_low": 2652.0,
      "confidence": 0.72,
      "multiplier_high": 0.62,
      "multiplier_low": 0.72
    },
    "movement_predictor": {
      "high_usd": 8.5,
      "low_usd": 3.0,
      "asymmetry_ratio": 2.83,
      "direction": "LONG"
    },
    "amd_detector": {
      "phase": "ACCUMULATION",
      "confidence": 0.68,
      "next_phase_prob": {"manipulation": 0.25, "distribution": 0.07}
    },
    "attention_model": {
      "pred_high": 2659.0,
      "pred_low": 2651.5,
      "attention_score": 1.8
    }
  },

  "metamodel": {
    "direction": "LONG",
    "confidence": 0.75,
    "entry": 2655.50,
    "tp": 2658.7,
    "sl": 2651.8,
    "rr_ratio": 2.83,
    "reasoning": ["High asymmetry", "Accumulation phase", "Attention > 1.5"]
  }
}
```
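
Several fields in the example response appear to be derived from others. The sketch below recomputes three of them from an excerpt of the payload. The formulas are assumptions inferred from the example values (distance from the current price in units of the dynamic factor, and upside over downside move); they are not stated definitions.

```python
import json

# Excerpt of the example response above
payload = json.loads("""
{
  "current_price": 2655.50,
  "dynamic_factor": 4.85,
  "models": {
    "range_predictor": {"pred_high": 2658.5, "pred_low": 2652.0},
    "movement_predictor": {"high_usd": 8.5, "low_usd": 3.0}
  }
}
""")

price = payload["current_price"]
factor = payload["dynamic_factor"]
rp = payload["models"]["range_predictor"]
mp = payload["models"]["movement_predictor"]

# Assumed definition: predicted distance from price, in units of the factor
multiplier_high = round((rp["pred_high"] - price) / factor, 2)
multiplier_low = round((price - rp["pred_low"]) / factor, 2)
# Assumed definition: expected upside move over expected downside move
asymmetry_ratio = round(mp["high_usd"] / mp["low_usd"], 2)

print(multiplier_high, multiplier_low, asymmetry_ratio)
```

Checks of this kind fit naturally into contract tests for the endpoint, since they catch responses whose derived fields drift from their inputs.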

### 7.2 Per-Model Endpoint

```
GET /api/ml/models/{model_type}/{symbol}
    ?timeframe=15m

# model_type: range | movement | amd | attention | ensemble
```

---

## 8. FRONTEND: 2 REQUIRED PAGES

### 8.1 "ML Realtime" Page (`/ml/realtime`)

**Purpose**: Real-time visualization of ML predictions

**Components**:
1. **Per-Asset Cards** (responsive grid)
   - Symbol + current price
   - Current dynamic factor
   - Visual attention weight (color bar)
   - Predicted direction (LONG/SHORT/NEUTRAL)
   - TP/SL levels
   - Confidence

2. **Filters**:
   - By symbol (multi-select)
   - By minimum confidence
   - By minimum attention weight
   - Active signals only

3. **Auto-refresh**: every 30 seconds

### 8.2 "ML Historical" Page (`/ml/historical`)

**Purpose**: Analysis of past predictions, without constant refresh

**Components**:
1. **Date Range Picker** (start - end)
2. **Symbol Selector**
3. **Model Selector** (or all)

4. **Predictions Table**:
   - Timestamp
   - Symbol
   - Model
   - Prediction (High/Low)
   - Actual (High/Low)
   - Error (MAE)
   - Outcome (TP hit / SL hit / Neutral)

5. **Equity Curve Chart** (based on predictions)

6. **Aggregate Metrics**:
   - Win rate per model
   - Average MAE
   - R² per period
   - Average factor used
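
The aggregate metrics can be computed from the same rows that feed the predictions table. The following is a minimal pandas sketch; the column names, the sample rows, and the choice to exclude neutral outcomes from the win rate are all assumptions for illustration.

```python
import pandas as pd

# Hypothetical prediction log; column names and values are illustrative only
log = pd.DataFrame({
    "model": ["range", "range", "range", "attention", "attention"],
    "pred_high": [2658.5, 2660.0, 2655.0, 2659.0, 2661.0],
    "actual_high": [2657.5, 2662.0, 2655.5, 2660.0, 2660.0],
    "outcome": ["tp", "sl", "tp", "tp", "neutral"],
    "factor": [4.85, 4.90, 4.80, 4.85, 4.90],
})

log["abs_error_high"] = (log["pred_high"] - log["actual_high"]).abs()
# Win rate over decided trades only (neutral outcomes are excluded)
decided = log[log["outcome"].isin(["tp", "sl"])]

summary = pd.DataFrame({
    "mae_high": log.groupby("model")["abs_error_high"].mean(),
    "win_rate": decided.groupby("model")["outcome"].apply(lambda s: (s == "tp").mean()),
    "avg_factor": log.groupby("model")["factor"].mean(),
})
print(summary)
```

Whether neutral outcomes count against the win rate changes the headline number noticeably, so the definition should be fixed before comparing models.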

---

## 9. IMPLEMENTATION PLAN

### Phase 1: Infrastructure (HIGH priority)
- [ ] Create `symbols_config.yaml`
- [ ] Implement `DynamicFactorCalculator`
- [ ] Create the `AttentionWeightedModel` base class
- [ ] Unit tests

### Phase 2: Model Migration
- [ ] Refactor `RangePredictor` to use dynamic factors
- [ ] Refactor `MovementMagnitudePredictor`
- [ ] Refactor `EnhancedRangePredictor`
- [ ] Regression tests

### Phase 3: Separation by Asset/Timeframe
- [ ] Create the directory structure
- [ ] Migration script for existing models
- [ ] Per-asset training pipeline
- [ ] Documentation

### Phase 4: API and Frontend
- [ ] Endpoint `/api/ml/predictions/{symbol}`
- [ ] MLRealtime page
- [ ] MLHistorical page
- [ ] Integration tests

---

## 10. SUCCESS METRICS

| Metric | Target | How to measure |
|--------|--------|----------------|
| Win rate (strong moves) | ≥ 80% | Backtesting with attention > 1.5 |
| Average R:R ratio | ≥ 2:1 | Average over executed trades |
| Factor adaptation time | < 24h | Correlation of factor vs. realized volatility |
| Prediction latency | < 100ms | API response time |
| Asset coverage | 100% of configured assets | Symbols with a trained model |

---

## 11. RISKS AND MITIGATIONS

| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| Dynamic factor lag | Medium | High | Use an EMA for incremental updates |
| Per-asset overfitting | Medium | Medium | Cross-validation + walk-forward |
| Model loss | Low | High | Versioning + backups |
| API incompatibility | Low | Medium | Contract tests |
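
For the walk-forward mitigation, the key property is that every test fold lies strictly after its training fold in time. The sketch below generates rolling folds over row indices; the function name, the fixed-size rolling window, and the fold sizes are assumptions, and an expanding window would be an equally reasonable variant.

```python
def walk_forward_splits(n_samples, train_size, test_size, step=None):
    """Yields (train_indices, test_indices) for rolling, time-ordered folds."""
    step = step or test_size
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += step


folds = list(walk_forward_splits(n_samples=1000, train_size=600, test_size=100))
for train, test in folds:
    print(f"train [{train.start}, {train.stop})  test [{test.start}, {test.stop})")
```

Because the dynamic factor is itself a rolling statistic, it should be computed on the full ordered series before splitting, so that each fold sees the factor it would have had in production.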

---

*Document generated: 2026-01-06*
*Pending review: Round 2*