--- id: "ET-ML-014" title: "MTS (Multi-Timeframe Synthesis) Strategy" type: "Technical Specification" status: "Approved" priority: "Alta" epic: "OQI-006" project: "trading-platform" version: "1.0.0" created_date: "2026-01-25" updated_date: "2026-01-25" task_reference: "TASK-2026-01-25-ML-TRAINING-ENHANCEMENT" --- # ET-ML-014: MTS (Multi-Timeframe Synthesis) Strategy ## Metadata | Campo | Valor | |-------|-------| | **ID** | ET-ML-014 | | **Epica** | OQI-006 - Senales ML | | **Tipo** | Especificacion Tecnica | | **Version** | 1.0.0 | | **Estado** | Aprobado | | **Ultima actualizacion** | 2026-01-25 | | **Tarea Referencia** | TASK-2026-01-25-ML-TRAINING-ENHANCEMENT | --- ## Resumen La estrategia MTS (Multi-Timeframe Synthesis) sintetiza informacion de multiples timeframes usando una **Hierarchical Attention Network** para generar una representacion unificada y **XGBoost** para predicciones finales. El modelo calcula **alineamiento entre timeframes** y **scores de conflicto**. ### Caracteristicas Clave - **4 Timeframes**: 5m, 15m, 1h, 4h agregados desde datos base - **Hierarchical Attention**: Aprende pesos dinamicos por timeframe - **Alignment Score**: Mide coherencia entre timeframes - **Conflict Score**: Identifica divergencias entre timeframes --- ## Arquitectura ### Diagrama de Alto Nivel ``` ┌─────────────────────────────────────────────────────────────────────────┐ │ MTS MODEL │ ├─────────────────────────────────────────────────────────────────────────┤ │ │ │ Input: 5m OHLCV Data (base timeframe) │ │ │ │ │ ▼ │ │ ┌───────────────────────────────────────────────────────────────────┐ │ │ │ TIMEFRAME AGGREGATION │ │ │ │ │ │ │ │ 5m Data ─┬──▶ 5m Features ───────────────────────┐ │ │ │ │ │ │ │ │ │ │ ├──▶ 15m Aggregation ──▶ 15m Features ───┤ │ │ │ │ │ │ │ │ │ │ ├──▶ 1h Aggregation ──▶ 1h Features ───┤ │ │ │ │ │ │ │ │ │ │ └──▶ 4h Aggregation ──▶ 4h Features ───┘ │ │ │ │ │ │ │ │ └─────────────────────────────────────────│──────────────────────────┘ │ │ ▼ │ │ 
┌───────────────────────────────────────────────────────────────────┐ │ │ │ HIERARCHICAL ATTENTION NETWORK │ │ │ │ │ │ │ │ ┌─────────────────────────────────────────────────────────────┐ │ │ │ │ │ Per-Timeframe Encoders │ │ │ │ │ │ │ │ │ │ │ │ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ │ │ │ │ │ │ │ 5m │ │ 15m │ │ 1h │ │ 4h │ │ │ │ │ │ │ │Enc. │ │Enc. │ │Enc. │ │Enc. │ │ │ │ │ │ │ └──┬──┘ └──┬──┘ └──┬──┘ └──┬──┘ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ └─────│──────────│──────────│──────────│────────────────────────┘ │ │ │ │ │ │ │ │ │ │ │ │ └──────────┼──────────┼──────────┘ │ │ │ │ ▼ ▼ │ │ │ │ ┌─────────────────────────────────┐ │ │ │ │ │ Cross-Timeframe Attention │ │ │ │ │ │ (learns TF relationships) │ │ │ │ │ └─────────────────────────────────┘ │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ ┌─────────────────────────────────┐ │ │ │ │ │ Timeframe Fusion Layer │ │ │ │ │ │ (learnable TF weights) │ │ │ │ │ └─────────────────────────────────┘ │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ ┌─────────────────────────────────┐ │ │ │ │ │ Unified Representation │ │ │ │ │ │ (d_model = 128) │ │ │ │ │ └─────────────────────────────────┘ │ │ │ └───────────────────────────────────────────────────────────────────┘ │ │ │ │ │ ▼ │ │ ┌───────────────────────────────────────────────────────────────────┐ │ │ │ XGBOOST HEADS │ │ │ │ │ │ │ │ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ │ │ │ │ │ Direction │ │ Confidence │ │ Optimal Entry │ │ │ │ │ │ Classifier │ │ Regressor │ │ TF Classifier │ │ │ │ │ │ (3 classes) │ │ (0 to 1) │ │ (4 classes) │ │ │ │ │ └────────────────┘ └────────────────┘ └────────────────┘ │ │ │ └───────────────────────────────────────────────────────────────────┘ │ │ │ │ │ ▼ │ │ Output: MTSPrediction │ │ - unified_direction: -1 to 1 │ │ - confidence_by_alignment: 0 to 1 │ │ - optimal_entry_tf: '5m' | '15m' | '1h' | '4h' │ │ - tf_contributions: Dict[str, float] │ └─────────────────────────────────────────────────────────────────────────┘ ``` ### Hierarchical Attention Config ```python HierarchicalAttentionConfig: 
    d_model: 128
    n_heads: 4
    d_ff: 256
    n_layers: 2
    dropout: 0.1
    max_seq_len: 256
    timeframes: ('5m', '15m', '1h', '4h')
    use_cross_tf_attention: True
    use_learnable_tf_weights: True
```

---

## Timeframe Aggregation

### Aggregation Process

The model receives 5m data and aggregates it to higher timeframes:

```python
import pandas as pd

class MTSFeatureEngineer:
    def aggregate_to_timeframe(
        self,
        df_5m: pd.DataFrame,
        target_tf: str
    ) -> pd.DataFrame:
        """Aggregate 5m data to the target timeframe"""
        # Map TF labels to pandas resample frequencies
        # (the bare '5m'/'15m' labels are not valid pandas aliases)
        freqs = {
            '5m': '5min',
            '15m': '15min',
            '1h': '1h',
            '4h': '4h'
        }
        freq = freqs[target_tf]

        # Aggregate OHLCV
        resampled = df_5m.resample(freq).agg({
            'open': 'first',
            'high': 'max',
            'low': 'min',
            'close': 'last',
            'volume': 'sum'
        })

        return resampled
```

### Features per Timeframe

Each timeframe has its own feature set:

```python
Per-Timeframe Features:
    # Price-based
    returns_1: 1-period return
    returns_5: 5-period return
    returns_10: 10-period return

    # Volatility
    atr_14: ATR(14)
    volatility: rolling std of returns

    # Momentum
    rsi_14: RSI(14)
    macd: MACD line
    macd_signal: Signal line

    # Trend
    ema_8: EMA(8)
    ema_21: EMA(21)
    ema_55: EMA(55)
    trend_strength: EMA crossover metrics
```

---

## Alignment and Conflict Scores

### Alignment Score

Measures directional coherence across timeframes:

```python
from typing import Dict, Sequence

import numpy as np

def compute_alignment_score(
    tf_features: Dict[str, np.ndarray]
) -> float:
    """Compute alignment across timeframes"""
    # Extract the direction of each timeframe
    directions = {}
    for tf, features in tf_features.items():
        # Direction based on trend indicators
        trend = np.sign(features['ema_8'] - features['ema_21'])
        momentum = np.sign(features['rsi_14'] - 50)
        macd_dir = np.sign(features['macd'])
        directions[tf] = np.mean([trend, momentum, macd_dir])

    # Measure coherence
    all_directions = list(directions.values())

    # Alignment = 1 if all timeframes share the same direction
    # Alignment = 0 if the directions are contradictory
    alignment = 1 - np.std(all_directions)

    return np.clip(alignment, 0, 1)
```

### Conflict Score

Identifies significant divergences:

```python
def compute_conflict_score(
    tf_features: Dict[str, np.ndarray],
    hierarchy: Sequence[str] = ('4h', '1h', '15m', '5m')
) -> float:
    """Compute conflict between timeframes"""
    # get_direction(): same trend/momentum/MACD vote used in
    # compute_alignment_score, applied to a single timeframe
    conflicts = []

    for i, higher_tf in enumerate(hierarchy[:-1]):
        for lower_tf in hierarchy[i+1:]:
            higher_dir = get_direction(tf_features[higher_tf])
            lower_dir = get_direction(tf_features[lower_tf])

            # Conflict if the directions are opposite
            if higher_dir * lower_dir < 0:
                # More weight when the conflict involves a higher TF
                weight = 1.0 / (i + 1)
                conflicts.append(weight)

    conflict_score = sum(conflicts) / len(hierarchy) if conflicts else 0
    return np.clip(conflict_score, 0, 1)
```

### Interpretation

| Alignment | Conflict | Interpretation | Action |
|-----------|----------|----------------|--------|
| > 0.8 | < 0.2 | Strong consensus | Trade with confidence |
| 0.5-0.8 | 0.2-0.5 | Partial consensus | Trade with caution |
| < 0.5 | > 0.5 | Mixed market | Wait or reduce size |
| < 0.3 | > 0.7 | Severe conflict | Do not trade |

---

## Hierarchical Attention Network

### Per-Timeframe Encoder

Each timeframe has its own encoder:

```python
from typing import Dict, List, Tuple

import torch
import torch.nn as nn

class TimeframeEncoder(nn.Module):
    def __init__(self, input_dim: int, d_model: int):
        super().__init__()
        self.input_proj = nn.Linear(input_dim, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(
                d_model=d_model,
                nhead=4,
                dim_feedforward=d_model * 2,
                dropout=0.1
            ),
            num_layers=2
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.input_proj(x)
        return self.encoder(x)
```

### Cross-Timeframe Attention

Models relationships between timeframes:

```python
class CrossTimeframeAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int, n_timeframes: int):
        super().__init__()
        self.attention = nn.MultiheadAttention(d_model, n_heads)
        # One embedding per timeframe
        self.tf_embeddings = nn.Embedding(n_timeframes, d_model)

    def forward(
        self,
        tf_features: Dict[str, torch.Tensor]
    ) -> Tuple[torch.Tensor, torch.Tensor]:
        # Stack the features of all TFs
        tf_list = list(tf_features.keys())
        features = torch.stack([tf_features[tf] for tf in tf_list])

        # Add TF embeddings
        tf_ids = torch.arange(len(tf_list))
        tf_embs = self.tf_embeddings(tf_ids)
        features = features + tf_embs.unsqueeze(1)

        # Cross-attention
        attended, weights = self.attention(features, features, features)

        return attended, weights
```

### Learnable Timeframe Weights

```python
import torch.nn.functional as F

class TimeframeFusion(nn.Module):
    def __init__(self, n_timeframes: int, d_model: int):
        super().__init__()
        # Learnable per-timeframe weights
        self.tf_weights = nn.Parameter(torch.ones(n_timeframes))
        # Fusion layer
        self.fusion = nn.Sequential(
            nn.Linear(d_model * n_timeframes, d_model * 2),
            nn.GELU(),
            nn.Dropout(0.1),
            nn.Linear(d_model * 2, d_model)
        )

    def forward(
        self,
        tf_features: List[torch.Tensor],
        alignment: torch.Tensor,
        conflict: torch.Tensor
    ) -> Tuple[torch.Tensor, torch.Tensor]:
        # Normalize the weights
        weights = F.softmax(self.tf_weights, dim=0)

        # Weight the features
        weighted_features = [feat * weights[i] for i, feat in enumerate(tf_features)]

        # Scale by alignment (more weight when timeframes agree)
        alignment_factor = 0.5 + alignment * 0.5

        # Concatenate and fuse
        concat = torch.cat(weighted_features, dim=-1)
        unified = self.fusion(concat) * alignment_factor

        return unified, weights
```

---

## Training Pipeline

### Phase 1: Feature Preparation

```python
model = MTSModel(config)

# Prepare multi-TF data from the 5m base
tf_features, alignment, conflict = model._prepare_features(df_5m)

# Shape: {tf: (n_samples, seq_len, n_features)}
print(f"5m features: {tf_features['5m'].shape}")
print(f"1h features: {tf_features['1h'].shape}")
print(f"Alignment: {alignment.shape}")
print(f"Conflict: {conflict.shape}")
```

### Phase 2: Attention Training

```python
# Train the hierarchical attention network
model.train_attention(
    train_data=train_dfs,  # List of 5m DataFrames
    val_data=val_dfs,
    epochs=50,
    batch_size=32,
    lr=0.001
)
```

### Phase 3: XGBoost Training

```python
# Extract unified representations
unified_rep = model._extract_unified_representation(tf_features, alignment, conflict)

# Generate labels
y_direction = generate_direction_labels(df, forward_bars=12)
y_confidence = generate_confidence_targets(alignment, conflict)
y_entry_tf = determine_optimal_entry_tf(df, tf_features)

# Train XGBoost
model.train_xgboost(
    X_train=unified_rep,
    y_direction=y_direction,
    y_confidence=y_confidence,
    y_entry_tf=y_entry_tf
)
```

### XGBoost Configuration

```python
MTSConfig:
    xgb_n_estimators: 200
    xgb_max_depth: 6
    xgb_learning_rate: 0.05
    xgb_subsample: 0.8
    xgb_colsample_bytree: 0.8
    xgb_min_child_weight: 5
    xgb_reg_alpha: 0.1
    xgb_reg_lambda: 1.0
```

---

## Evaluation Metrics

### Direction Metrics

| Metric | Description | Target |
|--------|-------------|--------|
| **Direction Accuracy** | Directional accuracy | >= 55% |
| **Direction F1** | Per-class F1 | >= 0.50 |
| **Alignment-Weighted Accuracy** | Accuracy weighted by alignment | >= 60% |

### Confidence Metrics

| Metric | Description | Target |
|--------|-------------|--------|
| **Confidence MSE** | Confidence prediction error | < 0.1 |
| **Calibration Score** | Confidence-accuracy alignment | > 0.8 |

### Multi-TF Metrics

| Metric | Description | Target |
|--------|-------------|--------|
| **TF Contribution Stability** | Stability of TF weights | std < 0.1 |
| **Optimal TF Accuracy** | Accuracy of optimal entry TF | >= 50% |
| **Alignment Prediction** | Correlation of predicted vs. actual alignment | > 0.7 |

---

## API and Usage

### Main Class: MTSModel

```python
from models.strategies.mts import MTSModel, MTSConfig, MTSPrediction

# Configuration
config = MTSConfig(
    timeframes=('5m', '15m', '1h', '4h'),
    d_model=128,
    n_heads=4,
    n_layers=2,
    use_cross_tf_attention=True,
    use_learnable_tf_weights=True
)

# Initialize the model
model = MTSModel(config, use_xgboost=True)

# Train
model.train_attention(train_dfs, val_dfs, epochs=50)
model.train_xgboost(X_train, y_direction, y_confidence, y_entry_tf)

# Predict from 5m data
prediction = model.predict(df_5m)

print(f"Direction: {prediction.direction_class} ({prediction.unified_direction:.2f})")
print(f"Confidence: {prediction.confidence:.2f}")
print(f"Optimal Entry TF: {prediction.optimal_entry_tf}")
print(f"TF Contributions: {prediction.tf_contributions}")
```

### MTSPrediction Class

```python
@dataclass
class MTSPrediction:
    unified_direction: float            # -1 to 1
    direction_class: str                # 'bullish', 'bearish', 'neutral'
    confidence: float                   # 0 to 1
    confidence_by_alignment: float      # Based on TF alignment
    optimal_entry_tf: str               # '5m', '15m', '1h', '4h'
    tf_contributions: Dict[str, float]  # Weight per TF
    signal_strength: float              # 0 to 1
    recommended_action: str             # 'buy', 'sell', 'hold'

    def to_dict(self) -> Dict[str, Any]:
        return {
            'unified_direction': float(self.unified_direction),
            'direction_class': self.direction_class,
            'confidence': float(self.confidence),
            'confidence_by_alignment': float(self.confidence_by_alignment),
            'optimal_entry_tf': self.optimal_entry_tf,
            'tf_contributions': self.tf_contributions,
            'signal_strength': float(self.signal_strength),
            'recommended_action': self.recommended_action
        }
```

### Timeframe Contributions

```python
# Get the contribution of each timeframe
contributions = model.get_tf_contributions()

# Example output:
# {
#     '5m': 0.15,   # Short-term noise
#     '15m': 0.25,  # Entry timing
#     '1h': 0.35,   # Main trend
#     '4h': 0.25    # Higher TF context
# }
# Interpretation: 1h is the most important TF for this prediction
```

---

## File Structure

```
apps/ml-engine/src/models/strategies/mts/
├── __init__.py
├── model.py                   # MTSModel, MTSConfig, MTSPrediction
├── feature_engineering.py     # MTSFeatureEngineer (aggregation + features)
├── hierarchical_attention.py  # HierarchicalAttention, CrossTFAttention
├── timeframe_fusion.py        # TimeframeFusion layer
└── trainer.py                 # MTSTrainer
```

---

## Trading Strategies

### Entry Based on Alignment

```python
def generate_entry_signal(pred: MTSPrediction) -> Optional[Dict]:
    """Generate a signal based on MTS"""
    # Only trade on high alignment
    if pred.confidence_by_alignment < 0.6:
        return None

    # Require a clear direction
    if abs(pred.unified_direction) < 0.3:
        return None

    direction = 'LONG' if pred.unified_direction > 0 else 'SHORT'

    return {
        'direction': direction,
        'entry_tf': pred.optimal_entry_tf,
        'confidence': pred.signal_strength,
        'tf_weights': pred.tf_contributions
    }
```

### Multi-TF Confirmation

```python
def check_tf_confirmation(pred: MTSPrediction) -> Dict:
    """Check multi-TF confirmation"""
    contributions = pred.tf_contributions

    # Check that the higher TFs are aligned
    higher_tf_weight = contributions['4h'] + contributions['1h']
    lower_tf_weight = contributions['15m'] + contributions['5m']

    # Prefer setups where the higher TFs carry more weight
    quality = 'HIGH' if higher_tf_weight > lower_tf_weight else 'MEDIUM'

    return {
        'quality': quality,
        'higher_tf_weight': higher_tf_weight,
        'lower_tf_weight': lower_tf_weight,
        'dominant_tf': max(contributions, key=contributions.get)
    }
```

---

## Production Considerations

### Real-Time Multi-TF Updates

```python
from collections import deque

class MTSRealtime:
    def __init__(self, model_path: str):
        self.model = MTSModel.load(model_path)
        self.buffers = {
            '5m': deque(maxlen=500),
            '15m': deque(maxlen=200),
            '1h': deque(maxlen=100),
            '4h': deque(maxlen=50)
        }

    def on_candle_5m(self, candle: Dict) -> Optional[MTSPrediction]:
        self.buffers['5m'].append(candle)

        # Aggregate to higher TFs when their bars close
        if self._is_15m_close():
            self._aggregate_to_tf('15m')
        if self._is_1h_close():
            self._aggregate_to_tf('1h')
        if self._is_4h_close():
            self._aggregate_to_tf('4h')

        # Predict once we have enough data
        if len(self.buffers['5m']) >= 100:
            df_5m = pd.DataFrame(list(self.buffers['5m']))
            return self.model.predict(df_5m)

        return None
```

### Aggregation Caching

```python
# Cache the higher TFs (they change less frequently)
@cached(ttl=60)  # Cache for 1 minute
def get_cached_1h_features(symbol: str) -> np.ndarray:
    df = get_1h_data(symbol, bars=100)
    return compute_features(df)

@cached(ttl=240)  # Cache for 4 minutes
def get_cached_4h_features(symbol: str) -> np.ndarray:
    df = get_4h_data(symbol, bars=50)
    return compute_features(df)
```

### Performance

| Operation | Time | Notes |
|-----------|------|-------|
| TF Aggregation | 10ms | 4 timeframes |
| Feature computation | 20ms | Per TF |
| Attention inference | 30ms | Hierarchical |
| XGBoost prediction | 5ms | 3 heads |
| **Total** | ~100ms | Full prediction |

---

## References

- [ET-ML-001: ML Engine Architecture](./ET-ML-001-arquitectura.md)
- [ET-ML-010: PVA Strategy](./ET-ML-010-pva-strategy.md)
- [Hierarchical Attention Networks (Yang et al.)](https://www.cs.cmu.edu/~./hovy/papers/16HLT-hierarchical-attention-networks.pdf)
- [Multi-Scale Temporal Fusion](https://arxiv.org/abs/1912.09363)

---

**Author:** ML-Specialist (NEXUS v4.0)
**Date:** 2026-01-25
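
---

## Appendix: Alignment/Conflict Scoring Sketch

As an illustration, the alignment and conflict scoring described in "Alignment and Conflict Scores" can be exercised end-to-end with a small self-contained sketch. Note the assumptions: `get_direction` is referenced but not defined in this spec, so here it is assumed to be the same trend/momentum/MACD vote used by the alignment score, and the per-TF feature dicts are hypothetical toy inputs standing in for `MTSFeatureEngineer` output with the same keys.

```python
# Standalone sketch of the alignment/conflict scoring described above.
# Assumption: get_direction() reuses the alignment score's indicator vote.
from typing import Dict, Sequence

import numpy as np

def get_direction(features: Dict[str, float]) -> float:
    """Direction vote in [-1, 1] from trend, momentum, and MACD signs."""
    trend = np.sign(features['ema_8'] - features['ema_21'])
    momentum = np.sign(features['rsi_14'] - 50)
    macd_dir = np.sign(features['macd'])
    return float(np.mean([trend, momentum, macd_dir]))

def compute_alignment_score(tf_features: Dict[str, Dict[str, float]]) -> float:
    """Alignment = 1 - std of per-TF directions, clipped to [0, 1]."""
    directions = [get_direction(f) for f in tf_features.values()]
    return float(np.clip(1 - np.std(directions), 0, 1))

def compute_conflict_score(
    tf_features: Dict[str, Dict[str, float]],
    hierarchy: Sequence[str] = ('4h', '1h', '15m', '5m'),
) -> float:
    """Sum of weighted opposite-direction pairs, normalized by TF count."""
    conflicts = []
    for i, higher_tf in enumerate(hierarchy[:-1]):
        for lower_tf in hierarchy[i + 1:]:
            if get_direction(tf_features[higher_tf]) * get_direction(tf_features[lower_tf]) < 0:
                conflicts.append(1.0 / (i + 1))  # heavier weight for higher-TF conflicts
    if not conflicts:
        return 0.0
    return float(np.clip(sum(conflicts) / len(hierarchy), 0, 1))

# Toy feature dicts (hypothetical values, same keys as the spec's features)
bull = {'ema_8': 101.0, 'ema_21': 100.0, 'rsi_14': 60.0, 'macd': 0.5}
bear = {'ema_8': 99.0, 'ema_21': 100.0, 'rsi_14': 40.0, 'macd': -0.5}

consensus = {tf: bull for tf in ('5m', '15m', '1h', '4h')}
mixed = {'4h': bull, '1h': bull, '15m': bear, '5m': bear}

print(compute_alignment_score(consensus))  # 1.0 — all TFs agree
print(compute_conflict_score(consensus))   # 0.0 — no opposing directions
print(compute_alignment_score(mixed))      # 0.0 — directions split evenly
print(compute_conflict_score(mixed))       # 0.75 — higher-TF conflicts weigh in
```

The `mixed` case matches the "severe conflict / do not trade" row of the interpretation table: the 4h/1h consensus opposes 15m/5m, so every cross-group pair contributes, with the 4h-rooted pairs at full weight.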