trading-platform/docs/02-definicion-modulos/OQI-006-ml-signals/estrategias/MODELOS-ML-DEFINICION.md
id: MODELOS-ML-DEFINICION
title: Arquitectura de Modelos ML - Trading Platform
type: Documentation
project: trading-platform
version: 1.0.0
updated_date: 2026-01-04

ML Model Architecture - Trading Platform

Version: 1.0.0 | Date: 2025-12-05 | Module: OQI-006-ml-signals | Author: Trading Strategist - Trading Platform


Table of Contents

  1. Overview
  2. Model 1: AMDDetector
  3. Model 2: RangePredictor
  4. Model 3: TPSLClassifier
  5. Model 4: LiquidityHunter
  6. Model 5: OrderFlowAnalyzer
  7. Meta-Model: StrategyOrchestrator
  8. Training Pipeline
  9. Metrics and Evaluation
  10. Production and Deployment

Overview

System Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    ORBIQUANT IA ML SYSTEM                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐             │
│  │    AMD      │  │  Liquidity  │  │ OrderFlow   │             │
│  │  Detector   │  │   Hunter    │  │  Analyzer   │             │
│  │  (Phase)    │  │   (Hunt)    │  │  (Flow)     │             │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘             │
│         │                │                │                     │
│         └────────────────┼────────────────┘                     │
│                          │                                      │
│                          ▼                                      │
│                ┌─────────────────┐                              │
│                │  Feature Union  │                              │
│                │   (Combined)    │                              │
│                └────────┬────────┘                              │
│                         │                                       │
│         ┌───────────────┴───────────────┐                       │
│         │                               │                       │
│         ▼                               ▼                       │
│  ┌─────────────┐                 ┌─────────────┐               │
│  │   Range     │                 │    TPSL     │               │
│  │  Predictor  │                 │ Classifier  │               │
│  │ (ΔH/ΔL)     │                 │  (P[TP])    │               │
│  └──────┬──────┘                 └──────┬──────┘               │
│         │                               │                       │
│         └───────────────┬───────────────┘                       │
│                         │                                       │
│                         ▼                                       │
│                ┌─────────────────┐                              │
│                │   Strategy      │                              │
│                │ Orchestrator    │                              │
│                │ (Meta-Model)    │                              │
│                └────────┬────────┘                              │
│                         │                                       │
│                         ▼                                       │
│                ┌─────────────────┐                              │
│                │ Signal Output   │                              │
│                │ BUY/SELL/HOLD   │                              │
│                └─────────────────┘                              │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Data Flow

Market Data (OHLCV)
         │
         ▼
Feature Engineering
    (50+ features)
         │
         ├─────────────┬─────────────┬─────────────┐
         ▼             ▼             ▼             ▼
    AMDDetector  Liquidity   OrderFlow    Base
                  Hunter      Analyzer     Features
         │             │             │             │
         └─────────────┴─────────────┴─────────────┘
                              │
                              ▼
                    Combined Feature Vector
                         (100+ dims)
                              │
                ┌─────────────┴─────────────┐
                ▼                           ▼
         RangePredictor            TPSLClassifier
                │                           │
                └─────────────┬─────────────┘
                              ▼
                    StrategyOrchestrator
                              │
                              ▼
                      Trading Signal

Design Principles

  1. Modular: each model is independent and reusable
  2. Scalable: new models are easy to add
  3. Interpretable: feature importance and explainability
  4. Robust: strict temporal validation (no look-ahead bias)
  5. Production-ready: API, monitoring, automatic retraining
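Principle 4 (strict temporal validation) is typically enforced with forward-chaining splits. A minimal sketch using scikit-learn's TimeSeriesSplit on synthetic placeholder data (the real pipeline would use the engineered features described below):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Synthetic placeholder data standing in for the engineered features
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = rng.integers(0, 4, size=1000)

tscv = TimeSeriesSplit(n_splits=5)
for train_idx, val_idx in tscv.split(X):
    # Every validation index comes strictly after every training index,
    # which is exactly what rules out look-ahead bias.
    assert train_idx.max() < val_idx.min()
```

Unlike a shuffled k-fold split, each fold only ever validates on data that lies entirely in the future relative to its training window.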

Model 1: AMDDetector

Description

Multiclass classifier that identifies the current market phase according to the AMD (Accumulation-Manipulation-Distribution) framework.

Architecture

Type: XGBoost multiclass classifier
Output: probabilities for 4 classes

from xgboost import XGBClassifier
from sklearn.preprocessing import RobustScaler

class AMDDetector:
    """
    Detects AMD phases using XGBoost
    """

    def __init__(self, config=None):
        self.config = config or self._default_config()
        self.model = self._init_model()
        self.scaler = RobustScaler()
        self.label_encoder = {
            0: 'neutral',
            1: 'accumulation',
            2: 'manipulation',
            3: 'distribution'
        }

    def _init_model(self):
        return XGBClassifier(
            objective='multi:softprob',
            num_class=4,
            n_estimators=300,
            max_depth=6,
            learning_rate=0.05,
            subsample=0.8,
            colsample_bytree=0.8,
            min_child_weight=5,
            gamma=0.2,
            reg_alpha=0.1,
            reg_lambda=1.0,
            scale_pos_weight=1.0,
            tree_method='hist',
            device='cuda',  # GPU support
            random_state=42
        )

Input Features

Dimension: 50 features

| Category         | Features                             | Count |
|------------------|--------------------------------------|-------|
| Price Action     | range_ratio, body_size, wicks, etc.  | 10    |
| Volume           | volume_ratio, trend, OBV, etc.       | 8     |
| Volatility       | ATR, volatility_*, percentiles       | 6     |
| Trend            | SMAs, slopes, strength               | 8     |
| Market Structure | higher_highs, lower_lows, BOS, CHOCH | 10    |
| Order Flow       | order_blocks, FVG, liquidity_grabs   | 8     |

def extract_amd_features(df):
    """
    Extrae features para AMDDetector
    """
    features = {}

    # Price action
    features['range_ratio'] = (df['high'] - df['low']) / df['high'].rolling(20).mean()
    features['body_size'] = abs(df['close'] - df['open']) / (df['high'] - df['low'])
    features['upper_wick'] = (df['high'] - df[['close', 'open']].max(axis=1)) / (df['high'] - df['low'])
    features['lower_wick'] = (df[['close', 'open']].min(axis=1) - df['low']) / (df['high'] - df['low'])
    features['buying_pressure'] = (df['close'] - df['low']) / (df['high'] - df['low'])
    features['selling_pressure'] = (df['high'] - df['close']) / (df['high'] - df['low'])

    # Volume
    features['volume_ratio'] = df['volume'] / df['volume'].rolling(20).mean()
    features['volume_trend'] = df['volume'].rolling(10).mean() - df['volume'].rolling(30).mean()
    features['obv'] = (df['volume'] * ((df['close'] > df['close'].shift(1)).astype(int) * 2 - 1)).cumsum()
    features['obv_slope'] = features['obv'].diff(5) / 5

    # Volatility
    features['atr'] = calculate_atr(df, 14)
    features['atr_ratio'] = features['atr'] / features['atr'].rolling(50).mean()
    features['volatility_10'] = df['close'].pct_change().rolling(10).std()
    features['volatility_20'] = df['close'].pct_change().rolling(20).std()

    # Trend
    features['sma_10'] = df['close'].rolling(10).mean()
    features['sma_20'] = df['close'].rolling(20).mean()
    features['sma_50'] = df['close'].rolling(50).mean()
    features['close_sma_ratio_20'] = df['close'] / features['sma_20']
    features['trend_slope'] = features['sma_20'].diff(5) / 5
    features['trend_strength'] = abs(features['trend_slope']) / features['atr']

    # Market structure
    features['higher_highs'] = (df['high'] > df['high'].shift(1)).rolling(10).sum()
    features['higher_lows'] = (df['low'] > df['low'].shift(1)).rolling(10).sum()
    features['lower_highs'] = (df['high'] < df['high'].shift(1)).rolling(10).sum()
    features['lower_lows'] = (df['low'] < df['low'].shift(1)).rolling(10).sum()

    # Order flow
    features['order_blocks_bullish'] = detect_order_blocks(df, 'bullish')
    features['order_blocks_bearish'] = detect_order_blocks(df, 'bearish')
    features['fvg_count_bullish'] = detect_fvg(df, 'bullish')
    features['fvg_count_bearish'] = detect_fvg(df, 'bearish')

    return pd.DataFrame(features)

Target Labeling

Method: forward-looking with a 20-period window

def label_amd_phase(df, i, forward_window=20):
    """
    Labels the AMD phase based on future behavior
    """
    if i + forward_window >= len(df):
        return 0  # neutral

    future = df.iloc[i:i+forward_window]
    current = df.iloc[i]

    # Metrics over the forward window
    price_range = (future['high'].max() - future['low'].min()) / current['close']
    price_end = future['close'].iloc[-1]
    price_start = current['close']

    # Accumulation criteria
    if price_range < 0.02:  # Tight range (< 2%)
        volume_declining = future['volume'].iloc[-5:].mean() < future['volume'].iloc[:5].mean()
        if volume_declining and price_end > price_start:
            return 1  # accumulation

    # Manipulation criteria
    false_breaks = count_false_breakouts(future)
    whipsaws = count_whipsaws(future)
    if false_breaks >= 2 or whipsaws >= 3:
        return 2  # manipulation

    # Distribution criteria
    if price_end < price_start * 0.98:  # Decline >= 2%
        volume_on_down = check_volume_on_down_moves(future)
        lower_highs = count_lower_highs(future)
        if volume_on_down and lower_highs >= 2:
            return 3  # distribution

    return 0  # neutral

Output

@dataclass
class AMDPrediction:
    phase: str                      # 'neutral', 'accumulation', etc.
    confidence: float               # 0-1
    probabilities: Dict[str, float] # {'neutral': 0.1, 'accumulation': 0.7, ...}
    strength: float                 # 0-1
    characteristics: Dict           # Phase-specific metrics
    timestamp: pd.Timestamp

# Example
prediction = amd_detector.predict(current_data)
# {
#   'phase': 'accumulation',
#   'confidence': 0.78,
#   'probabilities': {
#       'neutral': 0.05,
#       'accumulation': 0.78,
#       'manipulation': 0.12,
#       'distribution': 0.05
#   },
#   'strength': 0.71,
#   'timestamp': '2025-12-05 14:30:00'
# }

Evaluation Metrics

| Metric                 | Target | Actual |
|------------------------|--------|--------|
| Overall Accuracy       | >70%   | -      |
| Accumulation Precision | >65%   | -      |
| Manipulation Precision | >60%   | -      |
| Distribution Precision | >65%   | -      |
| Macro F1 Score         | >0.65  | -      |
| Weighted F1 Score      | >0.70  | -      |
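The macro and weighted F1 targets above can be computed with scikit-learn. A sketch with hypothetical labels for illustration (0=neutral, 1=accumulation, 2=manipulation, 3=distribution, matching the label_encoder):

```python
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical true/predicted phases, illustration only
y_true = [0, 1, 1, 2, 3, 0, 1, 2]
y_pred = [0, 1, 2, 2, 3, 0, 1, 0]

acc = accuracy_score(y_true, y_pred)                        # overall accuracy
macro_f1 = f1_score(y_true, y_pred, average='macro')        # unweighted mean over classes
weighted_f1 = f1_score(y_true, y_pred, average='weighted')  # weighted by class support
```

Macro F1 treats the rare manipulation class with the same weight as the common neutral class, which is why both variants are tracked.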

Model 2: RangePredictor

Description

Regression model that predicts delta_high and delta_low over multiple time horizons.

See existing implementation: [LEGACY: apps/ml-engine - migrated from TradingAgent]/src/models/range_predictor.py

Architecture

Type: XGBoost Regressor + Classifier (for bins)
Horizons: 15m (3 bars), 1h (12 bars), custom

class RangePredictor:
    """
    Predicts future price ranges
    """

    def __init__(self, config=None):
        self.config = config or self._default_config()
        self.horizons = ['15m', '1h']
        self.models = {}

        # Initialize models for each horizon
        for horizon in self.horizons:
            self.models[f'{horizon}_high_reg'] = XGBRegressor(**self.config['xgboost'])
            self.models[f'{horizon}_low_reg'] = XGBRegressor(**self.config['xgboost'])
            self.models[f'{horizon}_high_bin'] = XGBClassifier(**self.config['xgboost_classifier'])
            self.models[f'{horizon}_low_bin'] = XGBClassifier(**self.config['xgboost_classifier'])

Input Features

Dimension: 70+ features (base + AMD)

def prepare_range_features(df, amd_features):
    """
    Combines base features with AMDDetector outputs
    """
    # Base technical features (21 existing)
    base_features = extract_technical_features(df)

    # AMD features (from AMDDetector)
    amd_enhanced = {
        'phase_encoded': encode_phase(amd_features['phase']),
        'phase_accumulation_prob': amd_features['probabilities']['accumulation'],
        'phase_manipulation_prob': amd_features['probabilities']['manipulation'],
        'phase_distribution_prob': amd_features['probabilities']['distribution'],
        'phase_strength': amd_features['strength'],
        'range_compression': amd_features['characteristics'].get('range_compression', 0),
        'order_blocks_net': (
            amd_features['characteristics'].get('order_blocks_bullish', 0) -
            amd_features['characteristics'].get('order_blocks_bearish', 0)
        )
    }

    # Liquidity features (from LiquidityHunter)
    liquidity_features = {
        'bsl_distance': calculate_bsl_distance(df),
        'ssl_distance': calculate_ssl_distance(df),
        'liquidity_grab_recent': count_recent_liquidity_grabs(df),
        'fvg_count': count_unfilled_fvg(df)
    }

    # ICT features
    ict_features = {
        'ote_position': calculate_ote_position(df),
        'in_premium_zone': 1 if is_premium_zone(df) else 0,
        'in_discount_zone': 1 if is_discount_zone(df) else 0,
        'killzone_strength': get_killzone_strength(df),
        'weekly_range_position': calculate_weekly_position(df),
        'daily_range_position': calculate_daily_position(df)
    }

    # SMC features
    smc_features = {
        'choch_bullish_count': count_choch(df, 'bullish'),
        'choch_bearish_count': count_choch(df, 'bearish'),
        'bos_bullish_count': count_bos(df, 'bullish'),
        'bos_bearish_count': count_bos(df, 'bearish'),
        'displacement_strength': calculate_displacement(df),
        'market_structure_score': calculate_structure_score(df)
    }

    # Combine all
    return pd.DataFrame({
        **base_features,
        **amd_enhanced,
        **liquidity_features,
        **ict_features,
        **smc_features
    })

Targets

def calculate_range_targets(df, horizons={'15m': 3, '1h': 12}):
    """
    Computes range targets for training
    """
    targets = {}

    for name, periods in horizons.items():
        # Delta high/low
        targets[f'delta_high_{name}'] = (
            df['high'].rolling(periods).max().shift(-periods) - df['close']
        ) / df['close']

        targets[f'delta_low_{name}'] = (
            df['close'] - df['low'].rolling(periods).min().shift(-periods)
        ) / df['close']

        # Bins (volatility classification)
        # The deltas above are relative to price, so normalize ATR by the
        # close before binning; element-wise division keeps indices aligned.
        atr_rel = calculate_atr(df, 14) / df['close']

        def to_bin(ratio):
            if pd.isna(ratio):
                return np.nan
            if ratio < 0.3:
                return 0  # Very low
            elif ratio < 0.7:
                return 1  # Low
            elif ratio < 1.2:
                return 2  # Medium
            else:
                return 3  # High

        targets[f'bin_high_{name}'] = (targets[f'delta_high_{name}'] / atr_rel).apply(to_bin)
        targets[f'bin_low_{name}'] = (targets[f'delta_low_{name}'] / atr_rel).apply(to_bin)

    return pd.DataFrame(targets)

Output

@dataclass
class RangePrediction:
    horizon: str
    delta_high: float              # Predicted max price increase
    delta_low: float               # Predicted max price decrease
    delta_high_bin: int            # Volatility classification
    delta_low_bin: int
    confidence_high: float
    confidence_low: float
    predicted_high_price: float    # Absolute price
    predicted_low_price: float
    timestamp: pd.Timestamp

# Example
predictions = range_predictor.predict(features, current_price=89350)
# [
#   RangePrediction(
#       horizon='15m',
#       delta_high=0.0085,  # +0.85%
#       delta_low=0.0042,   # -0.42%
#       predicted_high_price=90109,
#       predicted_low_price=88975,
#       confidence_high=0.72,
#       confidence_low=0.68
#   ),
#   RangePrediction(horizon='1h', ...)
# ]

Evaluation Metrics

| Horizon | MAE High | MAE Low | MAPE  | Bin Accuracy | R²   |
|---------|----------|---------|-------|--------------|------|
| 15m     | <0.003   | <0.003  | <0.5% | >65%         | >0.3 |
| 1h      | <0.005   | <0.005  | <0.8% | >60%         | >0.2 |

Directional Accuracy:

  • High predictions: target >95%
  • Low predictions: target >50% (improving from 4-19%)
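Directional accuracy has no standard library helper; one plausible definition (an assumption here — the project may define it differently) counts how often the prediction and the realized delta fall on the same side of a small dead-band:

```python
import numpy as np

def directional_accuracy(pred, actual, eps=1e-4):
    """Fraction of samples where prediction and realized value
    lie on the same side of a small dead-band eps.
    (Assumed definition; the platform may use another.)
    """
    pred, actual = np.asarray(pred), np.asarray(actual)
    return float((np.sign(pred - eps) == np.sign(actual - eps)).mean())

# Hypothetical predicted vs realized delta_low values
da = directional_accuracy([0.004, 0.0, 0.003], [0.005, 0.002, 0.0])
```

The dead-band keeps near-zero noise from being counted as a directional call in either direction.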

Model 3: TPSLClassifier

Description

Binary classifier that predicts the probability that Take Profit is reached before Stop Loss.

See existing implementation: [LEGACY: apps/ml-engine - migrated from TradingAgent]/src/models/tp_sl_classifier.py

Architecture

Type: XGBoost binary classifier with R:R calibration
Configs: multiple ratios (2:1, 3:1, custom)

class TPSLClassifier:
    """
    Predicts the probability of TP being hit before SL
    """

    def __init__(self, config=None):
        self.config = config or self._default_config()
        self.horizons = ['15m', '1h']
        self.rr_configs = [
            {'name': 'rr_2_1', 'sl_atr_multiple': 0.3, 'tp_atr_multiple': 0.6},
            {'name': 'rr_3_1', 'sl_atr_multiple': 0.3, 'tp_atr_multiple': 0.9},
        ]
        self.models = {}
        self.calibrated_models = {}

        # Initialize models
        for horizon in self.horizons:
            for rr in self.rr_configs:
                key = f'{horizon}_{rr["name"]}'
                self.models[key] = XGBClassifier(**self.config['xgboost'])

Input Features

Dimension: 80+ features (base + AMD + range predictions)

def prepare_tpsl_features(df, amd_features, range_predictions):
    """
    TPSLClassifier features include stacking
    """
    # Base + AMD features (same as RangePredictor)
    base_features = prepare_range_features(df, amd_features)

    # Range predictions as features (stacking)
    range_stacking = {
        'pred_delta_high_15m': range_predictions['15m'].delta_high,
        'pred_delta_low_15m': range_predictions['15m'].delta_low,
        'pred_delta_high_1h': range_predictions['1h'].delta_high,
        'pred_delta_low_1h': range_predictions['1h'].delta_low,
        'pred_high_confidence': range_predictions['15m'].confidence_high,
        'pred_low_confidence': range_predictions['15m'].confidence_low,
        'pred_high_low_ratio': (
            range_predictions['15m'].delta_high /
            (range_predictions['15m'].delta_low + 1e-8)
        )
    }

    # R:R specific features
    rr_features = {
        'atr_current': calculate_atr(df, 14).iloc[-1],
        'volatility_regime': classify_volatility_regime(df),
        'trend_alignment': check_trend_alignment(df, amd_features),
        'liquidity_risk': calculate_liquidity_risk(df),
        'manipulation_risk': amd_features['probabilities']['manipulation']
    }

    return pd.DataFrame({
        **base_features,
        **range_stacking,
        **rr_features
    })

Targets

def calculate_tpsl_targets(df, horizons, rr_configs):
    """
    Computes whether TP is hit before SL
    """
    targets = {}
    atr = calculate_atr(df, 14)

    for horizon_name, periods in horizons.items():
        for rr in rr_configs:
            sl_distance = atr * rr['sl_atr_multiple']
            tp_distance = atr * rr['tp_atr_multiple']

            target_name = f'tp_first_{horizon_name}_{rr["name"]}'

            def check_tp_first(i):
                if i + periods >= len(df):
                    return np.nan

                entry = df['close'].iloc[i]
                sl_price = entry - sl_distance.iloc[i]
                tp_price = entry + tp_distance.iloc[i]

                future = df.iloc[i+1:i+periods+1]

                # Check which hits first
                for j, row in future.iterrows():
                    if row['low'] <= sl_price:
                        return 0  # SL hit first
                    elif row['high'] >= tp_price:
                        return 1  # TP hit first

                return np.nan  # Neither hit

            targets[target_name] = [check_tp_first(i) for i in range(len(df))]

    return pd.DataFrame(targets)

Probability Calibration

from sklearn.calibration import CalibratedClassifierCV

def calibrate_model(model, X_val, y_val):
    """
    Calibrates probabilities using isotonic regression
    """
    calibrated = CalibratedClassifierCV(
        model,
        method='isotonic',  # or 'sigmoid'
        cv='prefit'
    )
    calibrated.fit(X_val, y_val)
    return calibrated

# Usage
tpsl_classifier.models['15m_rr_2_1'].fit(X_train, y_train)
tpsl_classifier.calibrated_models['15m_rr_2_1'] = calibrate_model(
    tpsl_classifier.models['15m_rr_2_1'],
    X_val, y_val
)
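Calibration quality can be sanity-checked with the Brier score (mean squared error between predicted probabilities and binary outcomes; lower is better). A sketch on synthetic data, with LogisticRegression as a stand-in for the XGBoost model:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss

# Synthetic stand-in for the train/validation splits used above
rng = np.random.default_rng(42)
X_train, y_train = rng.normal(size=(500, 4)), rng.integers(0, 2, 500)
X_val, y_val = rng.normal(size=(200, 4)), rng.integers(0, 2, 200)

clf = LogisticRegression().fit(X_train, y_train)
probs = clf.predict_proba(X_val)[:, 1]
brier = brier_score_loss(y_val, probs)  # 0 = perfect; ~0.25 = uninformative coin flip
```

Comparing the Brier score of the raw and calibrated models on held-out data shows whether the calibration step actually helped.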

Output

@dataclass
class TPSLPrediction:
    horizon: str
    rr_config: str
    prob_tp_first: float          # P(TP before SL)
    prob_sl_first: float          # 1 - prob_tp_first
    recommended_action: str       # 'long', 'short', 'hold'
    confidence: float             # |prob - 0.5| * 2
    entry_price: float
    sl_price: float
    tp_price: float
    expected_value: float         # EV calculation
    timestamp: pd.Timestamp

# Example
predictions = tpsl_classifier.predict(
    features,
    current_price=89350,
    direction='long'
)
# [
#   TPSLPrediction(
#       horizon='15m',
#       rr_config='rr_2_1',
#       prob_tp_first=0.68,
#       recommended_action='long',
#       confidence=0.36,
#       entry_price=89350,
#       sl_price=89082,  # -0.3 ATR
#       tp_price=89886,  # +0.6 ATR
#       expected_value=0.136  # +13.6% EV
#   )
# ]
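The expected_value field is described only as "EV calculation". A standard R-multiple formulation is sketched below; this is an assumption, and the platform's exact normalization may differ (the 0.136 in the example does not follow from this formula):

```python
def expected_value_r(prob_tp: float, rr: float) -> float:
    """Expected value in R-multiples: win rr R with probability prob_tp,
    lose 1 R otherwise. (Assumed formulation, not necessarily the one
    used by TPSLClassifier.)
    """
    return prob_tp * rr - (1.0 - prob_tp)

ev = expected_value_r(0.68, 2.0)  # 0.68*2 - 0.32 = 1.04 R per trade
break_even = 1.0 / (1.0 + 2.0)    # p ≈ 0.333 is break-even at 2:1
```

Under this formulation a trade is worth taking only when prob_tp exceeds 1/(1+rr), which is where the min_tp_probability threshold in the orchestrator config would bite.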

Evaluation Metrics

| Metric    | Target | Actual (Phase 2) |
|-----------|--------|------------------|
| Accuracy  | >80%   | 85.9%            |
| Precision | >75%   | 82.1%            |
| Recall    | >75%   | 85.7%            |
| F1 Score  | >0.75  | 0.84             |
| ROC-AUC   | >0.85  | 0.94             |

Model 4: LiquidityHunter

Description

Specialized model that detects liquidity zones and predicts stop-hunting moves.

Architecture

Type: XGBoost binary classifier
Output: liquidity sweep probability

from sklearn.preprocessing import StandardScaler

class LiquidityHunter:
    """
    Detects and predicts stop hunting
    """

    def __init__(self, config=None):
        self.config = config or self._default_config()
        self.model_bsl = XGBClassifier(**self.config['xgboost'])  # Buy-side liquidity
        self.model_ssl = XGBClassifier(**self.config['xgboost'])  # Sell-side liquidity
        self.scaler = StandardScaler()

    def _default_config(self):
        return {
            'lookback_swing': 20,  # Periods for swing points
            'sweep_threshold': 0.005,  # 0.5% beyond level
            'xgboost': {
                'n_estimators': 200,
                'max_depth': 5,
                'learning_rate': 0.05,
                'scale_pos_weight': 2.0,  # Liquidity sweeps are rare
                'objective': 'binary:logistic',
                'eval_metric': 'auc'
            }
        }

Input Features

Dimension: 30 specialized features

def extract_liquidity_features(df, lookback=20):
    """
    Features for liquidity detection
    """
    features = {}

    # Identify liquidity pools
    swing_highs = df['high'].rolling(lookback, center=True).max()
    swing_lows = df['low'].rolling(lookback, center=True).min()

    # Distance to liquidity
    features['bsl_distance'] = (swing_highs - df['close']) / df['close']
    features['ssl_distance'] = (df['close'] - swing_lows) / df['close']

    # Liquidity density (how many levels nearby)
    features['bsl_density'] = count_levels_above(df, lookback)
    features['ssl_density'] = count_levels_below(df, lookback)

    # Recent sweep history
    features['bsl_sweeps_recent'] = count_bsl_sweeps(df, window=50)
    features['ssl_sweeps_recent'] = count_ssl_sweeps(df, window=50)

    # Volume profile near liquidity
    features['volume_at_bsl'] = calculate_volume_at_level(df, swing_highs)
    features['volume_at_ssl'] = calculate_volume_at_level(df, swing_lows)

    # Market structure
    features['higher_highs_forming'] = (df['high'] > df['high'].shift(1)).rolling(10).sum()
    features['lower_lows_forming'] = (df['low'] < df['low'].shift(1)).rolling(10).sum()

    # Volatility expansion (often precedes sweeps)
    atr = calculate_atr(df, 14)
    features['atr_expanding'] = (atr > atr.shift(5)).astype(int)
    features['volatility_regime'] = classify_volatility(df)

    # Price proximity to levels
    features['near_bsl'] = (features['bsl_distance'] < 0.01).astype(int)  # Within 1%
    features['near_ssl'] = (features['ssl_distance'] < 0.01).astype(int)

    # Time since last sweep
    features['bars_since_bsl_sweep'] = calculate_bars_since_sweep(df, 'bsl')
    features['bars_since_ssl_sweep'] = calculate_bars_since_sweep(df, 'ssl')

    # Manipulation signals
    features['false_breakouts_recent'] = count_false_breakouts(df, window=30)
    features['whipsaw_intensity'] = calculate_whipsaw_intensity(df)

    # AMD phase context
    features['in_manipulation_phase'] = check_manipulation_phase(df)

    return pd.DataFrame(features)

Targets

def label_liquidity_sweep(df, i, forward_window=10, sweep_threshold=0.005):
    """
    Labels whether a liquidity sweep will occur
    """
    if i + forward_window >= len(df):
        # Keep the return type consistent with the labeled case below
        return {'bsl_sweep': np.nan, 'ssl_sweep': np.nan, 'any_sweep': np.nan}

    current_high = df['high'].iloc[max(0, i-20):i].max()
    current_low = df['low'].iloc[max(0, i-20):i].min()

    future = df.iloc[i:i+forward_window]

    # BSL sweep (sweep of highs)
    bsl_sweep_price = current_high * (1 + sweep_threshold)
    bsl_swept = (future['high'] >= bsl_sweep_price).any()

    # SSL sweep (sweep of lows)
    ssl_sweep_price = current_low * (1 - sweep_threshold)
    ssl_swept = (future['low'] <= ssl_sweep_price).any()

    # Return binary targets
    return {
        'bsl_sweep': 1 if bsl_swept else 0,
        'ssl_sweep': 1 if ssl_swept else 0,
        'any_sweep': 1 if (bsl_swept or ssl_swept) else 0
    }

Output

@dataclass
class LiquidityPrediction:
    liquidity_type: str           # 'BSL' or 'SSL'
    sweep_probability: float      # 0-1
    liquidity_level: float        # Price level
    distance_pct: float           # Distance to level
    density: int                  # Number of levels nearby
    expected_timing: int          # Bars until sweep
    risk_score: float             # Higher = more likely to be trapped
    timestamp: pd.Timestamp

# Example
prediction = liquidity_hunter.predict(current_data)
# [
#   LiquidityPrediction(
#       liquidity_type='BSL',
#       sweep_probability=0.72,
#       liquidity_level=89450,
#       distance_pct=0.0011,  # 0.11% away
#       density=3,
#       expected_timing=5,  # ~5 bars
#       risk_score=0.68  # High risk of reversal after sweep
#   )
# ]

Metrics

| Metric              | Target |
|---------------------|--------|
| Precision           | >70%   |
| Recall              | >60%   |
| ROC-AUC             | >0.75  |
| False Positive Rate | <30%   |
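scikit-learn has no direct helper for False Positive Rate, but all four metrics fall out of the confusion matrix. A sketch with hypothetical sweep labels:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels: 1 = sweep occurred within the forward window
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([0, 1, 1, 0, 0, 1, 0, 0])

# ravel() order for binary labels is (tn, fp, fn, tp)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
precision = tp / (tp + fp)
recall = tp / (tp + fn)
fpr = fp / (fp + tn)
```

Tracking FPR alongside precision matters here because a false sweep signal can put the strategy on the wrong side of a manipulation move.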

Model 5: OrderFlowAnalyzer

Description

Analyzes order flow to detect institutional accumulation/distribution.

Note: optional model - requires granular volume data

Architecture

Type: LSTM / Transformer (for temporal sequences)
Output: accumulation/distribution score

import torch
import torch.nn as nn

class OrderFlowAnalyzer(nn.Module):
    """
    Analyzes order flow with an LSTM
    """

    def __init__(self, input_dim=10, hidden_dim=64, num_layers=2):
        super().__init__()
        self.lstm = nn.LSTM(
            input_dim,
            hidden_dim,
            num_layers,
            batch_first=True,
            dropout=0.2
        )
        self.fc = nn.Sequential(
            nn.Linear(hidden_dim, 32),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(32, 3)  # accumulation, neutral, distribution
        )

    def forward(self, x):
        # x shape: (batch, sequence, features)
        lstm_out, _ = self.lstm(x)
        # Take last output
        last_out = lstm_out[:, -1, :]
        output = self.fc(last_out)
        return torch.softmax(output, dim=1)

Input Features (Sequence)

Dimension: 10 features x 50 timesteps

def extract_order_flow_sequence(df, sequence_length=50):
    """
    Extracts a sequence of order-flow features
    """
    features = []

    for i in range(len(df) - sequence_length + 1):
        window = df.iloc[i:i+sequence_length]

        # Buy/sell volume split (computed first so derived features can reuse it;
        # the dict below cannot reference its own keys via window[...])
        buy_volume = window['volume'] * (window['close'] > window['open']).astype(int)
        sell_volume = window['volume'] * (window['close'] < window['open']).astype(int)

        # Tick data (if available)
        upticks = count_upticks(window)
        downticks = count_downticks(window)

        # Cumulative delta
        cumulative_delta = (buy_volume - sell_volume).cumsum()

        sequence_features = {
            # Volume delta
            'volume_delta': window['volume'] - window['volume'].shift(1),

            # Buy/Sell imbalance
            'buy_volume': buy_volume,
            'sell_volume': sell_volume,
            'imbalance': (buy_volume - sell_volume) / window['volume'],

            # Large orders detection
            'large_orders': (window['volume'] > window['volume'].rolling(20).mean() * 2).astype(int),

            'upticks': upticks,
            'downticks': downticks,
            'tick_imbalance': (upticks - downticks) / (upticks + downticks + 1),

            # Cumulative metrics
            'cumulative_delta': cumulative_delta,
            'cvd_slope': cumulative_delta.diff(5) / 5
        }

        features.append(pd.DataFrame(sequence_features))

    return np.array([f.values for f in features])

Output

@dataclass
class OrderFlowPrediction:
    flow_type: str                # 'accumulation', 'distribution', 'neutral'
    confidence: float
    imbalance_score: float        # -1 (selling) to +1 (buying)
    institutional_activity: float # 0-1
    large_orders_detected: int
    cvd_trend: str                # 'up', 'down', 'flat'
    timestamp: pd.Timestamp

Meta-Model: StrategyOrchestrator

Description

Combines all of the previous models to generate the final trading signal.

Architecture

Type: weighted ensemble + rule-based logic

class StrategyOrchestrator:
    """
    Meta-model that orchestrates all predictions
    """

    def __init__(self, models, config=None):
        self.amd_detector = models['amd_detector']
        self.range_predictor = models['range_predictor']
        self.tpsl_classifier = models['tpsl_classifier']
        self.liquidity_hunter = models['liquidity_hunter']
        self.order_flow_analyzer = models.get('order_flow_analyzer')

        self.config = config or self._default_config()
        self.weights = self.config['weights']

    def _default_config(self):
        return {
            'weights': {
                'amd': 0.30,
                'range': 0.25,
                'tpsl': 0.25,
                'liquidity': 0.15,
                'order_flow': 0.05
            },
            'min_confidence': 0.60,
            'min_tp_probability': 0.55,
            'risk_multiplier': 0.02  # 2% risk per trade
        }

    def generate_signal(self, market_data, current_price):
        """
        Genera se\u00f1al de trading combinando todos los modelos
        """
        signal = {
            'action': 'hold',
            'confidence': 0.0,
            'entry_price': current_price,
            'stop_loss': None,
            'take_profit': None,
            'position_size': 0.0,
            'reasoning': [],
            'model_outputs': {}
        }

        # 1. AMD Phase
        amd_pred = self.amd_detector.predict(market_data)
        signal['model_outputs']['amd'] = amd_pred

        if amd_pred['confidence'] < 0.6:
            signal['reasoning'].append('Low AMD confidence - avoiding trade')
            return signal

        # 2. Range Prediction
        range_pred = self.range_predictor.predict(market_data, current_price)
        signal['model_outputs']['range'] = range_pred

        # 3. TPSL Probability
        tpsl_pred = self.tpsl_classifier.predict(market_data, current_price)
        signal['model_outputs']['tpsl'] = tpsl_pred

        # 4. Liquidity Analysis
        liq_pred = self.liquidity_hunter.predict(market_data)
        signal['model_outputs']['liquidity'] = liq_pred

        # 5. Order Flow (if available)
        if self.order_flow_analyzer:
            flow_pred = self.order_flow_analyzer.predict(market_data)
            signal['model_outputs']['order_flow'] = flow_pred

        # === DECISION LOGIC ===

        # Determine bias from AMD
        if amd_pred['phase'] == 'accumulation':
            bias = 'bullish'
            signal['reasoning'].append(f'AMD: Accumulation phase (conf: {amd_pred["confidence"]:.2%})')
        elif amd_pred['phase'] == 'distribution':
            bias = 'bearish'
            signal['reasoning'].append(f'AMD: Distribution phase (conf: {amd_pred["confidence"]:.2%})')
        elif amd_pred['phase'] == 'manipulation':
            signal['reasoning'].append('AMD: Manipulation phase - avoiding entry')
            return signal
        else:
            signal['reasoning'].append('AMD: Neutral phase - no clear direction')
            return signal

        # Check range prediction alignment
        if bias == 'bullish':
            range_alignment = range_pred['15m'].delta_high > range_pred['15m'].delta_low * 1.5
        else:
            range_alignment = range_pred['15m'].delta_low > range_pred['15m'].delta_high * 1.5

        if not range_alignment:
            signal['reasoning'].append('Range prediction does not align with bias')
            return signal

        signal['reasoning'].append('Range prediction aligned')

        # Check TPSL probability
        relevant_tpsl = [p for p in tpsl_pred if p.recommended_action == bias.replace('ish', '')]
        if not relevant_tpsl or relevant_tpsl[0].prob_tp_first < self.config['min_tp_probability']:
            signal['reasoning'].append(f'Low TP probability: {relevant_tpsl[0].prob_tp_first:.2%}')
            return signal

        signal['reasoning'].append(f'High TP probability: {relevant_tpsl[0].prob_tp_first:.2%}')

        # Check liquidity risk
        if liq_pred:
            liquidity_risk = any(p.sweep_probability > 0.7 and p.distance_pct < 0.005 for p in liq_pred)
            if liquidity_risk:
                signal['reasoning'].append('High liquidity sweep risk nearby')
                # Reduce position size
                position_multiplier = 0.5
            else:
                position_multiplier = 1.0
        else:
            position_multiplier = 1.0

        # === CALCULATE CONFIDENCE ===
        confidence_score = 0.0

        # AMD contribution
        confidence_score += self.weights['amd'] * amd_pred['confidence']

        # Range contribution
        range_conf = (range_pred['15m'].confidence_high + range_pred['15m'].confidence_low) / 2
        confidence_score += self.weights['range'] * range_conf

        # TPSL contribution
        tpsl_conf = relevant_tpsl[0].confidence
        confidence_score += self.weights['tpsl'] * tpsl_conf

        # Liquidity contribution
        if liq_pred:
            liq_conf = 1 - max(p.risk_score for p in liq_pred)  # Inverse of risk
            confidence_score += self.weights['liquidity'] * liq_conf

        signal['confidence'] = confidence_score

        if confidence_score < self.config['min_confidence']:
            signal['reasoning'].append(f'Overall confidence too low: {confidence_score:.2%}')
            return signal

        # === GENERATE ENTRY ===
        signal['action'] = 'long' if bias == 'bullish' else 'short'
        signal['entry_price'] = current_price

        # Use TPSL predictions
        tpsl_entry = relevant_tpsl[0]
        signal['stop_loss'] = tpsl_entry.sl_price
        signal['take_profit'] = tpsl_entry.tp_price

        # Calculate position size
        account_risk = self.config['risk_multiplier']  # 2% of account
        price_risk = abs(current_price - tpsl_entry.sl_price) / current_price
        signal['position_size'] = (account_risk / price_risk) * position_multiplier

        signal['reasoning'].append(f'Signal generated: {signal["action"].upper()}')
        signal['reasoning'].append(f'Confidence: {confidence_score:.2%}')
        signal['reasoning'].append(f'R:R: {(abs(tpsl_entry.tp_price - current_price) / abs(current_price - tpsl_entry.sl_price)):.2f}:1')

        return signal
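Standalone, the weighted-confidence step reduces to a dot product of the configured weights and the per-model confidences. The per-model values below are hypothetical:

```python
# Default weights from the orchestrator config
weights = {'amd': 0.30, 'range': 0.25, 'tpsl': 0.25, 'liquidity': 0.15, 'order_flow': 0.05}

# Hypothetical per-model confidences for one evaluation
outputs = {'amd': 0.78, 'range': 0.64, 'tpsl': 0.68, 'liquidity': 0.70, 'order_flow': 0.55}

confidence = sum(weights[k] * outputs[k] for k in weights)
print(round(confidence, 4))  # 0.6965 -- above the 0.60 minimum, so the gate passes
```

Because the weights sum to 1.0, the combined score stays on the same 0-1 scale as the individual model confidences.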

### Decision Pipeline

```
Market Data
     │
     ▼
┌─────────────┐
│ AMDDetector │──── Phase = Accumulation? ──────┐
└─────────────┘     Confidence > 0.6?           │
                                                 NO → HOLD
                                                YES
                                                 │
                                                 ▼
                                        ┌─────────────────┐
                                        │ RangePredictor  │
                                        └────────┬────────┘
                                                 │
                                        ΔHigh > ΔLow * 1.5?
                                                 │
                                                YES
                                                 │
                                                 ▼
                                        ┌─────────────────┐
                                        │TPSLClassifier   │
                                        └────────┬────────┘
                                                 │
                                        P(TP first) > 0.55?
                                                 │
                                                YES
                                                 │
                                                 ▼
                                        ┌─────────────────┐
                                        │LiquidityHunter  │
                                        └────────┬────────┘
                                                 │
                                        Sweep risk low?
                                                 │
                                                YES
                                                 │
                                                 ▼
                                        ┌─────────────────┐
                                        │ Confidence      │
                                        │ Calculation     │
                                        └────────┬────────┘
                                                 │
                                        Total > 0.60?
                                                 │
                                                YES
                                                 │
                                                 ▼
                                        ┌─────────────────┐
                                        │  LONG SIGNAL    │
                                        │  Entry, SL, TP  │
                                        └─────────────────┘
```
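The gate chain in the diagram can be sketched as a small table-driven check. The thresholds mirror the documented defaults; the metric names are illustrative, not fields of any real model output:

```python
# Each gate: (metric name, minimum threshold); names are illustrative
GATES = [
    ('amd_confidence', 0.60),
    ('range_alignment_ratio', 1.5),   # |ΔHigh| / |ΔLow| for a bullish bias
    ('tp_probability', 0.55),
    ('total_confidence', 0.60),
]

def passes_all_gates(metrics):
    """Return True only if every gate threshold is met (otherwise the pipeline HOLDs)."""
    return all(metrics.get(name, 0.0) >= threshold for name, threshold in GATES)

example = {
    'amd_confidence': 0.78,
    'range_alignment_ratio': 1.8,
    'tp_probability': 0.68,
    'total_confidence': 0.73,
}
print(passes_all_gates(example))  # True
```

A missing metric defaults to 0.0 and therefore fails its gate, which matches the pipeline's fail-closed behavior.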

### Output

```python
@dataclass
class TradingSignal:
    action: str                   # 'long', 'short', 'hold'
    confidence: float             # 0-1
    entry_price: float
    stop_loss: float
    take_profit: float
    position_size: float          # Units or % of account
    risk_reward_ratio: float
    expected_value: float         # EV calculation
    reasoning: List[str]          # Why this signal
    model_outputs: Dict           # All model predictions
    timestamp: pd.Timestamp

    # Metadata
    symbol: str
    horizon: str
    amd_phase: str
    killzone: str

# Full example
signal = orchestrator.generate_signal(market_data, current_price=89350)
# TradingSignal(
#     action='long',
#     confidence=0.73,
#     entry_price=89350,
#     stop_loss=89082,
#     take_profit=89886,
#     position_size=0.15,  # 15% of account
#     risk_reward_ratio=2.0,
#     expected_value=0.214,  # +21.4% EV
#     reasoning=[
#         'AMD: Accumulation phase (conf: 78%)',
#         'Range prediction aligned',
#         'High TP probability: 68%',
#         'Signal generated: LONG',
#         'Confidence: 73%',
#         'R:R: 2.00:1'
#     ],
#     amd_phase='accumulation',
#     killzone='ny_am'
# )
```
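Using the example's own prices, the sizing and R:R formulas from `generate_signal` work out as follows. Note the raw formula yields a notional size relative to the stop distance; the 15% shown in the example suggests an additional cap applied in production, which is not documented here:

```python
entry, stop_loss, take_profit = 89350.0, 89082.0, 89886.0

account_risk = 0.02                               # 2% of account at risk per trade
price_risk = abs(entry - stop_loss) / entry       # stop distance as a fraction of price
position_size = account_risk / price_risk         # raw notional size (may be capped in practice)
risk_reward = abs(take_profit - entry) / abs(entry - stop_loss)

print(round(risk_reward, 2))  # 2.0
```

With a 268-point stop (≈0.3% of price), risking 2% of the account requires a notional several times the account size, which is why leveraged instruments typically cap `position_size`.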

## Pipeline de Entrenamiento

### Complete Workflow

```python
class MLTrainingPipeline:
    """
    End-to-end training pipeline
    """

    def __init__(self, data_path, config):
        self.data_path = data_path
        self.config = config
        self.models = {}

    def run(self):
        """Run the complete pipeline"""

        # 1. Load & prepare data
        print("1. Loading data...")
        df = self.load_data()

        # 2. Feature engineering
        print("2. Engineering features...")
        features = self.engineer_features(df)

        # 3. Target labeling
        print("3. Labeling targets...")
        targets = self.label_targets(df)

        # 4. Train-test split (temporal)
        print("4. Splitting data...")
        X_train, X_val, X_test, y_train, y_val, y_test = self.temporal_split(
            features, targets
        )

        # 5. Train AMDDetector
        print("5. Training AMDDetector...")
        self.models['amd_detector'] = self.train_amd_detector(
            X_train, y_train['amd'], X_val, y_val['amd']
        )

        # 6. Generate AMD features for next models
        print("6. Generating AMD features...")
        amd_features_train = self.models['amd_detector'].predict_proba(X_train)
        amd_features_val = self.models['amd_detector'].predict_proba(X_val)

        # 7. Train RangePredictor
        print("7. Training RangePredictor...")
        X_range_train = np.hstack([X_train, amd_features_train])
        X_range_val = np.hstack([X_val, amd_features_val])

        self.models['range_predictor'] = self.train_range_predictor(
            X_range_train, y_train['range'], X_range_val, y_val['range']
        )

        # 8. Generate range predictions for TPSL
        print("8. Generating range predictions...")
        range_preds_train = self.models['range_predictor'].predict(X_range_train)
        range_preds_val = self.models['range_predictor'].predict(X_range_val)

        # 9. Train TPSLClassifier
        print("9. Training TPSLClassifier...")
        X_tpsl_train = np.hstack([X_range_train, range_preds_train])
        X_tpsl_val = np.hstack([X_range_val, range_preds_val])

        self.models['tpsl_classifier'] = self.train_tpsl_classifier(
            X_tpsl_train, y_train['tpsl'], X_tpsl_val, y_val['tpsl']
        )

        # 10. Train LiquidityHunter
        print("10. Training LiquidityHunter...")
        self.models['liquidity_hunter'] = self.train_liquidity_hunter(
            X_train, y_train['liquidity'], X_val, y_val['liquidity']
        )

        # 11. Evaluate all models
        print("11. Evaluating models...")
        self.evaluate_all(X_test, y_test)

        # 12. Save models
        print("12. Saving models...")
        self.save_all_models()

        print("Training complete!")
        return self.models

    def temporal_split(self, features, targets, train_pct=0.7, val_pct=0.15):
        """Temporal split (no shuffling); features and targets must be time-ordered"""
        n = len(features)
        train_end = int(n * train_pct)
        val_end = int(n * (train_pct + val_pct))

        return (
            features[:train_end],
            features[train_end:val_end],
            features[val_end:],
            targets[:train_end],
            targets[train_end:val_end],
            targets[val_end:]
        )
```
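A quick check of the `temporal_split` logic on synthetic data confirms the 70/15/15 proportions and that no future rows leak into earlier splits (index order is preserved, never shuffled):

```python
import numpy as np

n = 1000
features = np.arange(n)  # stand-in for time-ordered feature rows

train_end = int(n * 0.7)
val_end = int(n * 0.85)
train, val, test = features[:train_end], features[train_end:val_end], features[val_end:]

print(len(train), len(val), len(test))  # 700 150 150
```

Every index in `train` precedes every index in `val`, and likewise for `test`, which is exactly the property a shuffled split would destroy.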

### Temporal Cross-Validation

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import TimeSeriesSplit

def temporal_cross_validation(model, X, y, n_splits=5):
    """
    Cross-validation that respects temporal ordering
    """
    tscv = TimeSeriesSplit(n_splits=n_splits)
    scores = []

    for fold, (train_idx, val_idx) in enumerate(tscv.split(X)):
        print(f"Fold {fold + 1}/{n_splits}")

        X_train, X_val = X[train_idx], X[val_idx]
        y_train, y_val = y[train_idx], y[val_idx]

        # Train
        model.fit(X_train, y_train)

        # Evaluate
        y_pred = model.predict(X_val)
        score = accuracy_score(y_val, y_pred)
        scores.append(score)

        print(f"  Accuracy: {score:.4f}")

    print(f"\nMean Accuracy: {np.mean(scores):.4f} ± {np.std(scores):.4f}")
    return scores
```
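`TimeSeriesSplit` grows the training window fold by fold while the validation window always lies strictly after it. With scikit-learn's default test size of `n_samples // (n_splits + 1)`, 20 samples and 4 splits produce these fold shapes:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(20).reshape(-1, 1)
tscv = TimeSeriesSplit(n_splits=4)

folds = [(len(train_idx), len(val_idx)) for train_idx, val_idx in tscv.split(X)]
print(folds)  # [(4, 4), (8, 4), (12, 4), (16, 4)]
```

Later folds therefore train on more history, which is also how the model would be retrained in production.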

## Métricas y Evaluación

### Per-Model Metrics

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score, classification_report, confusion_matrix, f1_score,
    mean_absolute_error, mean_squared_error, precision_score, r2_score,
    recall_score, roc_auc_score
)

class ModelEvaluator:
    """
    Complete model evaluation
    """

    @staticmethod
    def evaluate_amd_detector(model, X_test, y_test):
        """Evaluate AMDDetector"""
        y_pred = model.predict(X_test)
        y_pred_proba = model.predict_proba(X_test)  # kept for probability-based metrics

        metrics = {
            'accuracy': accuracy_score(y_test, y_pred),
            'macro_f1': f1_score(y_test, y_pred, average='macro'),
            'weighted_f1': f1_score(y_test, y_pred, average='weighted'),
            'classification_report': classification_report(y_test, y_pred),
            'confusion_matrix': confusion_matrix(y_test, y_pred)
        }

        # Per-class metrics
        for class_idx, class_name in model.label_encoder.items():
            mask = y_test == class_idx
            if mask.sum() > 0:
                metrics[f'{class_name}_precision'] = precision_score(
                    y_test == class_idx, y_pred == class_idx
                )
                metrics[f'{class_name}_recall'] = recall_score(
                    y_test == class_idx, y_pred == class_idx
                )

        return metrics

    @staticmethod
    def evaluate_range_predictor(model, X_test, y_test):
        """Evaluate RangePredictor"""
        predictions = model.predict(X_test)

        metrics = {}
        for horizon in ['15m', '1h']:
            for target_type in ['high', 'low']:
                y_true = y_test[f'delta_{target_type}_{horizon}']
                y_pred = [p.delta_high if target_type == 'high' else p.delta_low
                          for p in predictions if p.horizon == horizon]

                metrics[f'{horizon}_{target_type}_mae'] = mean_absolute_error(y_true, y_pred)
                metrics[f'{horizon}_{target_type}_rmse'] = np.sqrt(mean_squared_error(y_true, y_pred))
                metrics[f'{horizon}_{target_type}_r2'] = r2_score(y_true, y_pred)

                # Directional accuracy
                direction_true = np.sign(y_true)
                direction_pred = np.sign(y_pred)
                metrics[f'{horizon}_{target_type}_directional_acc'] = (
                    direction_true == direction_pred
                ).mean()

        return metrics

    @staticmethod
    def evaluate_tpsl_classifier(model, X_test, y_test):
        """Evaluate TPSLClassifier"""
        metrics = {}

        for horizon in ['15m', '1h']:
            for rr in ['rr_2_1', 'rr_3_1']:
                target_key = f'tp_first_{horizon}_{rr}'
                y_true = y_test[target_key].dropna()

                if len(y_true) == 0:
                    continue

                X_valid = X_test[y_test[target_key].notna()]

                y_pred = model.predict_proba(X_valid, horizon, rr)
                y_pred_class = (y_pred > 0.5).astype(int)

                metrics[f'{horizon}_{rr}_accuracy'] = accuracy_score(y_true, y_pred_class)
                metrics[f'{horizon}_{rr}_roc_auc'] = roc_auc_score(y_true, y_pred)
                metrics[f'{horizon}_{rr}_precision'] = precision_score(y_true, y_pred_class)
                metrics[f'{horizon}_{rr}_recall'] = recall_score(y_true, y_pred_class)
                metrics[f'{horizon}_{rr}_f1'] = f1_score(y_true, y_pred_class)

        return metrics
```
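Directional accuracy, used above for the range targets, scores only the sign of the predicted move, not its magnitude. A hypothetical check:

```python
import numpy as np

# Hypothetical realized and predicted range deltas
y_true = np.array([0.8, -0.5, 0.3, -0.2, 0.4])
y_pred = np.array([0.5, -0.1, -0.2, -0.3, 0.1])

# Fraction of predictions whose sign matches the realized move
directional_acc = (np.sign(y_true) == np.sign(y_pred)).mean()
print(directional_acc)  # 0.8
```

A model can have a mediocre R² yet a high directional accuracy, which is often the more tradeable property.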

### Signal Backtesting

```python
class SignalBacktester:
    """
    Backtest of generated signals
    """

    def __init__(self, initial_capital=10000):
        self.initial_capital = initial_capital
        self.capital = initial_capital
        self.trades = []
        self.equity_curve = []

    def run(self, df, signals):
        """Run the backtest"""
        position = None

        for signal in signals:
            if signal['action'] == 'hold':
                continue

            # Entry
            if position is None and signal['action'] in ['long', 'short']:
                position = {
                    'type': signal['action'],
                    'entry_price': signal['entry_price'],
                    'entry_time': signal['timestamp'],
                    'stop_loss': signal['stop_loss'],
                    'take_profit': signal['take_profit'],
                    'size': signal['position_size']
                }

            # Check exit (simplification: scan up to 100 bars after entry)
            if position is not None:
                future_bars = df[df.index > signal['timestamp']].head(100)

                for idx, row in future_bars.iterrows():
                    # Long exits
                    if position['type'] == 'long' and row['low'] <= position['stop_loss']:
                        self._close_position(position, position['stop_loss'], idx, 'SL')
                        position = None
                        break
                    elif position['type'] == 'long' and row['high'] >= position['take_profit']:
                        self._close_position(position, position['take_profit'], idx, 'TP')
                        position = None
                        break

                    # Short exits (SL above entry, TP below)
                    elif position['type'] == 'short' and row['high'] >= position['stop_loss']:
                        self._close_position(position, position['stop_loss'], idx, 'SL')
                        position = None
                        break
                    elif position['type'] == 'short' and row['low'] <= position['take_profit']:
                        self._close_position(position, position['take_profit'], idx, 'TP')
                        position = None
                        break

            self.equity_curve.append(self.capital)

        return self._calculate_metrics()

    def _close_position(self, position, exit_price, exit_time, exit_reason):
        """Close the position"""
        if position['type'] == 'long':
            pnl = (exit_price - position['entry_price']) / position['entry_price']
        else:
            pnl = (position['entry_price'] - exit_price) / position['entry_price']

        pnl_amount = self.capital * position['size'] * pnl
        self.capital += pnl_amount

        self.trades.append({
            'type': position['type'],
            'entry_price': position['entry_price'],
            'exit_price': exit_price,
            'entry_time': position['entry_time'],
            'exit_time': exit_time,
            'exit_reason': exit_reason,
            'pnl_pct': pnl * 100,
            'pnl_amount': pnl_amount
        })

    def _calculate_metrics(self):
        """Compute performance metrics"""
        if not self.trades:
            return {}

        trades_df = pd.DataFrame(self.trades)

        total_return = (self.capital - self.initial_capital) / self.initial_capital
        num_trades = len(trades_df)
        num_wins = (trades_df['pnl_pct'] > 0).sum()
        num_losses = (trades_df['pnl_pct'] < 0).sum()
        win_rate = num_wins / num_trades if num_trades > 0 else 0

        avg_win = trades_df[trades_df['pnl_pct'] > 0]['pnl_pct'].mean() if num_wins > 0 else 0
        avg_loss = trades_df[trades_df['pnl_pct'] < 0]['pnl_pct'].mean() if num_losses > 0 else 0

        # Sharpe ratio (annualization assumes daily observations; adjust for signal frequency)
        returns = pd.Series(self.equity_curve).pct_change().dropna()
        sharpe = np.sqrt(252) * (returns.mean() / returns.std()) if returns.std() > 0 else 0

        # Max drawdown
        equity_series = pd.Series(self.equity_curve)
        cummax = equity_series.cummax()
        drawdown = (equity_series - cummax) / cummax
        max_drawdown = drawdown.min()

        return {
            'total_return_pct': total_return * 100,
            'final_capital': self.capital,
            'num_trades': num_trades,
            'num_wins': num_wins,
            'num_losses': num_losses,
            'win_rate': win_rate * 100,
            'avg_win_pct': avg_win,
            'avg_loss_pct': avg_loss,
            'profit_factor': abs(avg_win * num_wins / (avg_loss * num_losses)) if num_losses > 0 else np.inf,
            'sharpe_ratio': sharpe,
            'max_drawdown_pct': max_drawdown * 100
        }
```
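The drawdown computation in `_calculate_metrics` can be verified on a toy equity curve:

```python
import pandas as pd

equity = pd.Series([10000, 10500, 10200, 9800, 10100, 10800])

cummax = equity.cummax()                # running peak
drawdown = (equity - cummax) / cummax   # distance below the running peak
max_drawdown = drawdown.min()

print(round(max_drawdown * 100, 2))  # -6.67 (% at the 9800 trough)
```

The metric is path-dependent: the final value 10800 exceeds every earlier peak, yet the worst peak-to-trough loss along the way is still -6.67%.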

## Producción y Deployment

### FastAPI Service

```python
from datetime import datetime
from typing import Dict

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="Trading Platform ML Service")

# Load models
orchestrator = StrategyOrchestrator.load('models/orchestrator_v1.pkl')

class PredictionRequest(BaseModel):
    symbol: str
    timeframe: str = '5m'
    include_reasoning: bool = True

class PredictionResponse(BaseModel):
    signal: TradingSignal
    metadata: Dict

@app.post("/api/signal")
async def get_trading_signal(request: PredictionRequest):
    """
    Generate a trading signal
    """
    try:
        # Fetch market data
        market_data = fetch_market_data(request.symbol, request.timeframe)

        # Generate signal
        signal = orchestrator.generate_signal(
            market_data,
            current_price=market_data['close'].iloc[-1]
        )

        return PredictionResponse(
            signal=signal,
            metadata={
                'model_version': '1.0.0',
                'latency_ms': 45,
                'timestamp': datetime.now().isoformat()
            }
        )

    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@app.get("/api/health")
async def health_check():
    return {
        'status': 'healthy',
        'models_loaded': True,
        'version': '1.0.0'
    }
```

### Monitoring

```python
import prometheus_client as prom

# Metrics
prediction_counter = prom.Counter('ml_predictions_total', 'Total predictions')
prediction_latency = prom.Histogram('ml_prediction_latency_seconds', 'Prediction latency')
model_accuracy = prom.Gauge('ml_model_accuracy', 'Model accuracy', ['model_name'])

@prediction_latency.time()
def generate_signal_monitored(data):
    prediction_counter.inc()
    signal = orchestrator.generate_signal(data)
    return signal
```

### Retraining Pipeline

```python
class AutoRetrainingPipeline:
    """
    Automatic retraining pipeline
    """

    def __init__(self, schedule='weekly', config=None):
        self.schedule = schedule
        self.config = config
        self.performance_threshold = 0.70

    def should_retrain(self):
        """Decide whether retraining is needed"""
        # Check recent performance
        recent_accuracy = self.get_recent_accuracy()

        if recent_accuracy < self.performance_threshold:
            return True, 'Performance degradation'

        # Check data drift
        drift_detected = self.detect_data_drift()
        if drift_detected:
            return True, 'Data drift detected'

        return False, None

    def execute_retraining(self):
        """Run retraining"""
        print("Starting retraining...")

        # Fetch new data
        new_data = self.fetch_latest_data()

        # Retrain all models
        pipeline = MLTrainingPipeline(new_data, self.config)
        new_models = pipeline.run()

        # Validate new models
        if self.validate_new_models(new_models):
            # Deploy new models
            self.deploy_models(new_models)
            print("Retraining complete. New models deployed.")
        else:
            print("Validation failed. Keeping old models.")
```
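`detect_data_drift` is left abstract above. A minimal, hypothetical implementation could flag a mean shift between a reference window and recent data; production systems typically use PSI or Kolmogorov-Smirnov tests instead:

```python
import numpy as np

def detect_data_drift(reference, recent, z_threshold=3.0):
    """Flag drift when the recent mean is implausibly far from the reference mean."""
    ref_mean, ref_std = reference.mean(), reference.std()
    # z-score of the recent mean under the reference distribution
    z = abs(recent.mean() - ref_mean) / (ref_std / np.sqrt(len(recent)) + 1e-12)
    return bool(z > z_threshold)

reference = np.tile([-1.0, 1.0], 5000)     # mean 0, std 1
recent_ok = np.tile([-1.0, 1.0], 250)      # same distribution
recent_shifted = np.tile([0.0, 1.0], 250)  # mean shifted to +0.5

print(detect_data_drift(reference, recent_ok), detect_data_drift(reference, recent_shifted))  # False True
```

A mean-shift test only catches first-moment drift; distributional changes with an unchanged mean (e.g. a volatility regime change) need a distribution-level test.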

---

**Documento Generado**: 2025-12-05
**Próxima Revisión**: 2025-Q1
**Contacto**: ml-engineering@trading.ai