---
id: FEATURES-TARGETS-COMPLETO
title: Complete Features and Targets - Trading Platform ML Models
type: Documentation
project: trading-platform
version: 1.0.0
updated_date: 2026-01-04
---

# Complete Features and Targets - Trading Platform ML Models

**Version:** 2.0.0 | **Date:** 2025-12-08 | **Module:** OQI-006-ml-signals | **Author:** Trading Strategist - Trading Platform


## Table of Contents

1. Overview
2. Features by Category
3. Targets by Model
4. Feature Engineering Pipeline
5. Validation and Testing

## Overview

### Feature Summary

| Category | Count | Used by |
|---|---|---|
| Price Action | 12 | AMD, Range, TPSL |
| Volume | 10 | AMD, Range, TPSL |
| Volatility | 8 | AMD, Range, TPSL |
| Trend | 10 | AMD, Range, TPSL |
| Market Structure | 12 | AMD, SMC |
| Order Flow | 10 | AMD, Liquidity |
| Liquidity | 8 | Liquidity, TPSL |
| ICT | 15 | ICT Context, Range |
| SMC | 12 | AMD, TPSL |
| Time | 6 | All |
| **TOTAL** | **103** | - |

### Target Summary

| Model | Target Type | Classes/Values | Horizon |
|---|---|---|---|
| AMDDetector | Multiclass | 4 (neutral, acc, manip, dist) | 20 bars |
| RangePredictor | Regression | delta_high, delta_low | 15m, 1h |
| TPSLClassifier | Binary | 0/1 (SL first / TP first) | Variable |
| LiquidityHunter | Binary | 0/1 (no sweep / sweep) | 10 bars |
| ICTContextModel | Continuous | 0-1 score | Current |

## Features by Category

### 1. Price Action Features (12)

| # | Feature | Formula | Type | Model(s) |
|---|---|---|---|---|
| 1 | range_ratio | (high - low) / SMA(high - low, 20) | float | AMD, Range |
| 2 | range_pct | (high - low) / close | float | AMD, Range |
| 3 | body_size | abs(close - open) / (high - low + 1e-8) | float | AMD |
| 4 | upper_wick | (high - max(close, open)) / (high - low + 1e-8) | float | AMD |
| 5 | lower_wick | (min(close, open) - low) / (high - low + 1e-8) | float | AMD |
| 6 | buying_pressure | (close - low) / (high - low + 1e-8) | float | AMD, Range |
| 7 | selling_pressure | (high - close) / (high - low + 1e-8) | float | AMD, Range |
| 8 | close_position | (close - low) / (high - low + 1e-8) | float | AMD |
| 9 | range_expansion | range_ratio > 1.3 | binary | AMD |
| 10 | range_compression | range_ratio < 0.7 | binary | AMD |
| 11 | gap_up | open > high.shift(1) | binary | Range |
| 12 | gap_down | open < low.shift(1) | binary | Range |
The code snippets throughout this document assume `pandas` and `numpy` imported as `pd` and `np`:

```python
import numpy as np
import pandas as pd


def extract_price_action_features(df):
    """Extract price action features."""
    f = {}

    hl_range = df['high'] - df['low']
    hl_range_safe = hl_range.replace(0, 1e-8)  # avoid division by zero on doji bars

    f['range_ratio'] = hl_range / hl_range.rolling(20).mean()
    f['range_pct'] = hl_range / df['close']
    f['body_size'] = abs(df['close'] - df['open']) / hl_range_safe
    f['upper_wick'] = (df['high'] - df[['close', 'open']].max(axis=1)) / hl_range_safe
    f['lower_wick'] = (df[['close', 'open']].min(axis=1) - df['low']) / hl_range_safe
    f['buying_pressure'] = (df['close'] - df['low']) / hl_range_safe
    f['selling_pressure'] = (df['high'] - df['close']) / hl_range_safe
    f['close_position'] = f['buying_pressure']  # identical by construction
    f['range_expansion'] = (f['range_ratio'] > 1.3).astype(int)
    f['range_compression'] = (f['range_ratio'] < 0.7).astype(int)
    f['gap_up'] = (df['open'] > df['high'].shift(1)).astype(int)
    f['gap_down'] = (df['open'] < df['low'].shift(1)).astype(int)

    return pd.DataFrame(f)
```
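As a quick sanity check, the body and wick definitions above always partition the bar's full range, and the pressure features are range-normalized distances. A self-contained example on a single synthetic bar:

```python
import pandas as pd

# One synthetic OHLC bar: open=100, high=110, low=95, close=105
df = pd.DataFrame({'open': [100.0], 'high': [110.0], 'low': [95.0], 'close': [105.0]})
hl_range = df['high'] - df['low']                                         # 15
body_size = (df['close'] - df['open']).abs() / hl_range                   # 5/15
upper_wick = (df['high'] - df[['close', 'open']].max(axis=1)) / hl_range  # 5/15
lower_wick = (df[['close', 'open']].min(axis=1) - df['low']) / hl_range   # 5/15
buying_pressure = (df['close'] - df['low']) / hl_range                    # 10/15

# body + both wicks always spans the full high-low range
assert abs((body_size + upper_wick + lower_wick).iloc[0] - 1.0) < 1e-9
```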

### 2. Volume Features (10)

| # | Feature | Formula | Type | Model(s) |
|---|---|---|---|---|
| 1 | volume_ratio | volume / SMA(volume, 20) | float | AMD, Range |
| 2 | volume_trend | SMA(volume, 10) - SMA(volume, 30) | float | AMD |
| 3 | volume_spike | volume > SMA(volume, 20) * 2 | binary | AMD |
| 4 | obv | On Balance Volume | float | AMD |
| 5 | obv_slope | (OBV - OBV.shift(5)) / 5 | float | AMD |
| 6 | vwap | Volume Weighted Average Price | float | Range |
| 7 | vwap_distance | (close - vwap) / vwap | float | Range |
| 8 | volume_on_up | sum(vol if close > open) / total_vol (20 bars) | float | AMD |
| 9 | volume_on_down | sum(vol if close < open) / total_vol (20 bars) | float | AMD |
| 10 | volume_imbalance | volume_on_up - volume_on_down | float | AMD |
```python
def extract_volume_features(df):
    """Extract volume features."""
    f = {}

    f['volume_ratio'] = df['volume'] / df['volume'].rolling(20).mean()
    f['volume_trend'] = df['volume'].rolling(10).mean() - df['volume'].rolling(30).mean()
    f['volume_spike'] = (df['volume'] > df['volume'].rolling(20).mean() * 2).astype(int)

    # OBV
    obv_direction = ((df['close'] > df['close'].shift(1)).astype(int) * 2 - 1)
    f['obv'] = (df['volume'] * obv_direction).cumsum()
    f['obv_slope'] = f['obv'].diff(5) / 5

    # VWAP
    typical_price = (df['high'] + df['low'] + df['close']) / 3
    f['vwap'] = (typical_price * df['volume']).cumsum() / df['volume'].cumsum()
    f['vwap_distance'] = (df['close'] - f['vwap']) / f['vwap']

    # Volume distribution
    up_bars = (df['close'] > df['open']).astype(int)
    down_bars = (df['close'] < df['open']).astype(int)

    f['volume_on_up'] = (df['volume'] * up_bars).rolling(20).sum() / df['volume'].rolling(20).sum()
    f['volume_on_down'] = (df['volume'] * down_bars).rolling(20).sum() / df['volume'].rolling(20).sum()
    f['volume_imbalance'] = f['volume_on_up'] - f['volume_on_down']

    return pd.DataFrame(f)
```

### 3. Volatility Features (8)

| # | Feature | Formula | Type | Model(s) |
|---|---|---|---|---|
| 1 | atr | Average True Range (14) | float | All |
| 2 | atr_ratio | ATR / SMA(ATR, 50) | float | AMD, Range |
| 3 | atr_percentile | ATR percentile over 100 bars | float | Range |
| 4 | volatility_10 | std(returns, 10) | float | Range |
| 5 | volatility_20 | std(returns, 20) | float | Range |
| 6 | volatility_50 | std(returns, 50) | float | Range |
| 7 | volatility_ratio | volatility_10 / volatility_50 | float | AMD |
| 8 | bollinger_width | (BB_upper - BB_lower) / BB_middle | float | Range |
```python
def extract_volatility_features(df):
    """Extract volatility features."""
    f = {}

    # ATR
    tr = pd.concat([
        df['high'] - df['low'],
        abs(df['high'] - df['close'].shift(1)),
        abs(df['low'] - df['close'].shift(1))
    ], axis=1).max(axis=1)

    f['atr'] = tr.rolling(14).mean()
    f['atr_ratio'] = f['atr'] / f['atr'].rolling(50).mean()
    f['atr_percentile'] = f['atr'].rolling(100).apply(
        lambda x: pd.Series(x).rank(pct=True).iloc[-1]
    )

    # Returns volatility
    returns = df['close'].pct_change()
    f['volatility_10'] = returns.rolling(10).std()
    f['volatility_20'] = returns.rolling(20).std()
    f['volatility_50'] = returns.rolling(50).std()
    f['volatility_ratio'] = f['volatility_10'] / f['volatility_50']

    # Bollinger Width
    sma_20 = df['close'].rolling(20).mean()
    std_20 = df['close'].rolling(20).std()
    bb_upper = sma_20 + 2 * std_20
    bb_lower = sma_20 - 2 * std_20
    f['bollinger_width'] = (bb_upper - bb_lower) / sma_20

    return pd.DataFrame(f)
```

### 4. Trend Features (10)

| # | Feature | Formula | Type | Model(s) |
|---|---|---|---|---|
| 1 | sma_10 | Simple Moving Average (10) | float | Range |
| 2 | sma_20 | Simple Moving Average (20) | float | Range |
| 3 | sma_50 | Simple Moving Average (50) | float | Range |
| 4 | close_sma_10_ratio | close / SMA_10 | float | AMD |
| 5 | close_sma_20_ratio | close / SMA_20 | float | AMD |
| 6 | sma_slope_20 | (SMA_20 - SMA_20.shift(5)) / 5 | float | AMD |
| 7 | trend_strength | abs(sma_slope_20) / ATR | float | AMD |
| 8 | adx | Average Directional Index (14) | float | AMD |
| 9 | plus_di | +DI (14) | float | Range |
| 10 | minus_di | -DI (14) | float | Range |
```python
def extract_trend_features(df):
    """Extract trend features."""
    f = {}

    # SMAs
    f['sma_10'] = df['close'].rolling(10).mean()
    f['sma_20'] = df['close'].rolling(20).mean()
    f['sma_50'] = df['close'].rolling(50).mean()

    f['close_sma_10_ratio'] = df['close'] / f['sma_10']
    f['close_sma_20_ratio'] = df['close'] / f['sma_20']
    f['sma_slope_20'] = f['sma_20'].diff(5) / 5

    # Trend strength (calculate_atr is the same true-range ATR used in the
    # volatility section)
    atr = calculate_atr(df, 14)
    f['trend_strength'] = abs(f['sma_slope_20']) / atr

    # ADX calculation
    f['adx'], f['plus_di'], f['minus_di'] = calculate_adx(df, 14)

    return pd.DataFrame(f)


def calculate_adx(df, period=14):
    """Compute ADX, +DI and -DI."""
    plus_dm = df['high'].diff()
    minus_dm = -df['low'].diff()

    plus_dm = plus_dm.where((plus_dm > minus_dm) & (plus_dm > 0), 0)
    minus_dm = minus_dm.where((minus_dm > plus_dm) & (minus_dm > 0), 0)

    tr = pd.concat([
        df['high'] - df['low'],
        abs(df['high'] - df['close'].shift(1)),
        abs(df['low'] - df['close'].shift(1))
    ], axis=1).max(axis=1)

    atr = tr.rolling(period).mean()
    plus_di = 100 * (plus_dm.rolling(period).mean() / atr)
    minus_di = 100 * (minus_dm.rolling(period).mean() / atr)

    # epsilon avoids 0/0 when both DI values are zero
    dx = 100 * abs(plus_di - minus_di) / (plus_di + minus_di + 1e-8)
    adx = dx.rolling(period).mean()

    return adx, plus_di, minus_di
```

### 5. Market Structure Features (12)

| # | Feature | Formula | Type | Model(s) |
|---|---|---|---|---|
| 1 | higher_highs_count | HH count over 20 bars | int | AMD |
| 2 | higher_lows_count | HL count over 20 bars | int | AMD |
| 3 | lower_highs_count | LH count over 20 bars | int | AMD |
| 4 | lower_lows_count | LL count over 20 bars | int | AMD |
| 5 | swing_high_distance | Distance to nearest swing high | float | Liquidity |
| 6 | swing_low_distance | Distance to nearest swing low | float | Liquidity |
| 7 | bos_bullish_count | Bullish BOS count over 30 bars | int | SMC |
| 8 | bos_bearish_count | Bearish BOS count over 30 bars | int | SMC |
| 9 | choch_bullish_count | Bullish CHOCH count over 30 bars | int | SMC |
| 10 | choch_bearish_count | Bearish CHOCH count over 30 bars | int | SMC |
| 11 | structure_score | Structure score (-1 to +1) | float | AMD |
| 12 | structure_alignment | Alignment with trend | binary | Range |
```python
def extract_market_structure_features(df, lookback=20):
    """Extract market structure features."""
    f = {}

    # Higher highs/lows, lower highs/lows
    f['higher_highs_count'] = (df['high'] > df['high'].shift(1)).rolling(lookback).sum()
    f['higher_lows_count'] = (df['low'] > df['low'].shift(1)).rolling(lookback).sum()
    f['lower_highs_count'] = (df['high'] < df['high'].shift(1)).rolling(lookback).sum()
    f['lower_lows_count'] = (df['low'] < df['low'].shift(1)).rolling(lookback).sum()

    # Swing distances (detect_swing_points and the detectors below are module
    # helpers defined elsewhere)
    swing_highs = detect_swing_points(df, 'high', lookback)
    swing_lows = detect_swing_points(df, 'low', lookback)

    f['swing_high_distance'] = calculate_distance_to_nearest(df['close'], swing_highs, 'above')
    f['swing_low_distance'] = calculate_distance_to_nearest(df['close'], swing_lows, 'below')

    # BOS and CHOCH counts
    bos_signals = detect_bos(df, lookback)
    choch_signals = detect_choch(df, lookback)

    f['bos_bullish_count'] = count_signals(bos_signals, 'bullish', 30)
    f['bos_bearish_count'] = count_signals(bos_signals, 'bearish', 30)
    f['choch_bullish_count'] = count_signals(choch_signals, 'bullish', 30)
    f['choch_bearish_count'] = count_signals(choch_signals, 'bearish', 30)

    # Structure score
    bullish_points = f['higher_highs_count'] + f['higher_lows_count']
    bearish_points = f['lower_highs_count'] + f['lower_lows_count']
    total_points = bullish_points + bearish_points + 1e-8
    f['structure_score'] = (bullish_points - bearish_points) / total_points

    # Structure alignment
    trend_direction = np.sign(df['close'].rolling(20).mean().diff(5))
    f['structure_alignment'] = (np.sign(f['structure_score']) == trend_direction).astype(int)

    return pd.DataFrame(f)
```
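The extractor above leans on helpers such as `detect_swing_points` that are not shown in this document. As an illustration only, a minimal fractal-style detector could look like this (the centered window, `wing` parameter, and return shape are assumptions, not the module's actual implementation):

```python
import pandas as pd

def detect_swing_points(df, column, wing=2):
    # A bar is a swing high (low) if it is the extreme of the `wing`
    # bars on each side; non-swing bars come back as NaN.
    s = df[column]
    window = 2 * wing + 1
    if column == 'high':
        is_swing = s == s.rolling(window, center=True).max()
    else:
        is_swing = s == s.rolling(window, center=True).min()
    return s.where(is_swing)

df = pd.DataFrame({'high': [1, 2, 5, 2, 1, 3, 2]})
swings = detect_swing_points(df, 'high', wing=2)  # only index 2 qualifies
```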

### 6. Order Flow Features (10)

| # | Feature | Formula | Type | Model(s) |
|---|---|---|---|---|
| 1 | order_blocks_bullish | Bullish OB count over 30 bars | int | AMD |
| 2 | order_blocks_bearish | Bearish OB count over 30 bars | int | AMD |
| 3 | ob_net | OB_bullish - OB_bearish | int | AMD |
| 4 | fvg_bullish_count | Unfilled bullish FVG count | int | Range |
| 5 | fvg_bearish_count | Unfilled bearish FVG count | int | Range |
| 6 | fvg_nearest_distance | Distance to nearest FVG | float | Range |
| 7 | false_breakout_count | False breakout count over 30 bars | int | AMD |
| 8 | whipsaw_intensity | Frequency of rapid reversals | float | AMD |
| 9 | reversal_count | Reversal count over 20 bars | int | AMD |
| 10 | displacement_strength | Strength of the last displacement | float | SMC |
```python
def extract_order_flow_features(df, lookback=30):
    """Extract order flow features."""
    f = {}

    # Order blocks
    ob_bullish = identify_order_blocks(df, 'bullish')
    ob_bearish = identify_order_blocks(df, 'bearish')

    f['order_blocks_bullish'] = count_recent(ob_bullish, lookback)
    f['order_blocks_bearish'] = count_recent(ob_bearish, lookback)
    f['ob_net'] = f['order_blocks_bullish'] - f['order_blocks_bearish']

    # Fair Value Gaps
    fvg_bullish = identify_fvg(df, 'bullish')
    fvg_bearish = identify_fvg(df, 'bearish')

    f['fvg_bullish_count'] = count_unfilled_fvg(fvg_bullish, df['close'])
    f['fvg_bearish_count'] = count_unfilled_fvg(fvg_bearish, df['close'])
    f['fvg_nearest_distance'] = calculate_nearest_fvg_distance(df, fvg_bullish + fvg_bearish)

    # False breakouts and whipsaws
    f['false_breakout_count'] = count_false_breakouts(df, lookback)
    f['whipsaw_intensity'] = calculate_whipsaw_intensity(df, lookback)

    # Reversals (note: shift(-1) reads the next bar's return, so this feature
    # is only available with a one-bar delay)
    price_changes = df['close'].pct_change()
    reversals = ((price_changes > 0.005) & (price_changes.shift(-1) < -0.005)) | \
                ((price_changes < -0.005) & (price_changes.shift(-1) > 0.005))
    f['reversal_count'] = reversals.rolling(20).sum()

    # Displacement
    f['displacement_strength'] = calculate_displacement_strength(df)

    return pd.DataFrame(f)
```
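`identify_fvg` is also left undefined here. A common three-candle definition of a Fair Value Gap is that the current bar's low clears the high from two bars ago (bullish), leaving an unfilled gap, and the mirror image for bearish. A sketch under that assumption (the return format is illustrative, not necessarily what the module uses):

```python
import pandas as pd

def identify_fvg(df, direction):
    # Three-candle FVG: bullish when low[i] > high[i-2]; the gap zone
    # spans from high[i-2] (bottom) up to low[i] (top).
    if direction == 'bullish':
        mask = df['low'] > df['high'].shift(2)
        gaps = pd.DataFrame({'top': df['low'], 'bottom': df['high'].shift(2)})
    else:
        mask = df['high'] < df['low'].shift(2)
        gaps = pd.DataFrame({'top': df['low'].shift(2), 'bottom': df['high']})
    return gaps[mask]

df = pd.DataFrame({'high': [10, 12, 15, 16], 'low': [9, 11, 13, 14]})
bull = identify_fvg(df, 'bullish')  # gaps at bars 2 and 3
```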

### 7. Liquidity Features (8)

| # | Feature | Formula | Type | Model(s) |
|---|---|---|---|---|
| 1 | bsl_distance | Distance to Buy Side Liquidity | float | Liquidity |
| 2 | ssl_distance | Distance to Sell Side Liquidity | float | Liquidity |
| 3 | bsl_strength | Number of stops clustered above | int | Liquidity |
| 4 | ssl_strength | Number of stops clustered below | int | Liquidity |
| 5 | liquidity_grab_count | Recent grab count (20 bars) | int | AMD |
| 6 | time_since_bsl_sweep | Bars since last BSL sweep | int | Liquidity |
| 7 | time_since_ssl_sweep | Bars since last SSL sweep | int | Liquidity |
| 8 | liquidity_imbalance | (BSL_strength - SSL_strength) / total | float | Liquidity |
```python
def extract_liquidity_features(df, lookback=20):
    """Extract liquidity features."""
    f = {}

    # Identify liquidity pools (center=True windows look at future bars:
    # suitable for research and labeling, not for live feature computation)
    swing_highs = df['high'].rolling(lookback, center=True).max()
    swing_lows = df['low'].rolling(lookback, center=True).min()

    # Distances to liquidity
    f['bsl_distance'] = (swing_highs - df['close']) / df['close']
    f['ssl_distance'] = (df['close'] - swing_lows) / df['close']

    # Liquidity strength (number of swing points)
    f['bsl_strength'] = count_swing_points_above(df, lookback)
    f['ssl_strength'] = count_swing_points_below(df, lookback)

    # Liquidity grabs
    f['liquidity_grab_count'] = count_liquidity_grabs(df, lookback)

    # Time since sweeps
    bsl_sweeps = detect_bsl_sweeps(df)
    ssl_sweeps = detect_ssl_sweeps(df)

    f['time_since_bsl_sweep'] = bars_since_last(bsl_sweeps)
    f['time_since_ssl_sweep'] = bars_since_last(ssl_sweeps)

    # Liquidity imbalance
    total_liquidity = f['bsl_strength'] + f['ssl_strength'] + 1e-8
    f['liquidity_imbalance'] = (f['bsl_strength'] - f['ssl_strength']) / total_liquidity

    return pd.DataFrame(f)
```
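`bars_since_last` can be written without an explicit loop. A minimal vectorized sketch, assuming the sweep detectors return boolean Series (an assumption; the module's actual signature is not shown here):

```python
import numpy as np
import pandas as pd

def bars_since_last(events):
    # events: boolean Series, True on bars where a sweep occurred.
    # Returns the number of bars elapsed since the most recent True
    # (NaN before the first event).
    idx = pd.Series(np.arange(len(events)), index=events.index)
    last_event = idx.where(events).ffill()
    return idx - last_event

events = pd.Series([False, True, False, False, True, False])
out = bars_since_last(events)  # NaN, 0, 1, 2, 0, 1
```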

### 8. ICT Features (15)

| # | Feature | Formula | Type | Model(s) |
|---|---|---|---|---|
| 1 | ote_position | Position within Fibonacci range (0-1) | float | ICT |
| 2 | in_discount_zone | Price at 21-38% of the range | binary | ICT |
| 3 | in_premium_zone | Price at 62-79% of the range | binary | ICT |
| 4 | in_ote_buy_zone | Optimal buy zone (62-79% retracement) | binary | ICT |
| 5 | in_ote_sell_zone | Optimal sell zone (21-38% retracement) | binary | ICT |
| 6 | is_london_kz | In London Open killzone | binary | ICT |
| 7 | is_ny_kz | In NY AM killzone | binary | ICT |
| 8 | is_asian_kz | In Asian killzone | binary | ICT |
| 9 | killzone_strength | Session strength (0-1) | float | ICT |
| 10 | session_overlap | In London/NY overlap | binary | ICT |
| 11 | weekly_range_position | Position within weekly range | float | ICT |
| 12 | daily_range_position | Position within daily range | float | ICT |
| 13 | mmsm_detected | Market Maker Sell Model | binary | ICT |
| 14 | mmbm_detected | Market Maker Buy Model | binary | ICT |
| 15 | po3_phase | Power of 3 phase (1-3) | int | ICT |
```python
def extract_ict_features(df, timestamps):
    """Extract ICT features."""
    f = {}

    # OTE zones (position is measured from the 50-bar swing low, so
    # 0.21-0.38 of the range equals a 62-79% retracement of the up leg)
    swing_high = df['high'].rolling(50).max()
    swing_low = df['low'].rolling(50).min()
    range_size = swing_high - swing_low

    f['ote_position'] = (df['close'] - swing_low) / range_size
    f['in_discount_zone'] = ((f['ote_position'] >= 0.21) & (f['ote_position'] <= 0.38)).astype(int)
    f['in_premium_zone'] = ((f['ote_position'] >= 0.62) & (f['ote_position'] <= 0.79)).astype(int)
    f['in_ote_buy_zone'] = f['in_discount_zone']
    f['in_ote_sell_zone'] = f['in_premium_zone']

    # Killzones
    killzones = identify_killzones(timestamps)
    f['is_london_kz'] = (killzones == 'london_open').astype(int)
    f['is_ny_kz'] = (killzones == 'ny_am').astype(int)
    f['is_asian_kz'] = (killzones == 'asian').astype(int)

    f['killzone_strength'] = get_killzone_strength(killzones)
    f['session_overlap'] = ((killzones == 'london_close') | (killzones == 'ny_am')).astype(int)

    # Range positions
    f['weekly_range_position'] = calculate_weekly_position(df)
    f['daily_range_position'] = calculate_daily_position(df)

    # Market Maker Models
    f['mmsm_detected'] = detect_mmsm(df)
    f['mmbm_detected'] = detect_mmbm(df)

    # Power of 3
    f['po3_phase'] = calculate_po3_phase(df, timestamps)

    return pd.DataFrame(f)
```
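`identify_killzones` depends on session conventions that this document does not pin down. The sketch below uses illustrative UTC windows (Asian 00:00-04:00, London open 07:00-10:00, NY AM 13:00-16:00); the project's actual hours are an assumption here and may differ:

```python
import pandas as pd

def identify_killzones(timestamps):
    # Illustrative UTC session windows; real killzone hours are a
    # project convention not specified in this document.
    hours = timestamps.hour
    out = pd.Series('none', index=timestamps)
    out[(hours >= 0) & (hours < 4)] = 'asian'
    out[(hours >= 7) & (hours < 10)] = 'london_open'
    out[(hours >= 13) & (hours < 16)] = 'ny_am'
    return out

ts = pd.DatetimeIndex(['2025-12-08 08:30', '2025-12-08 14:00', '2025-12-08 20:00'])
kz = identify_killzones(ts)
```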

### 9. SMC Features (12)

| # | Feature | Formula | Type | Model(s) |
|---|---|---|---|---|
| 1 | choch_bullish_recent | Bullish CHOCH within 30 bars | binary | SMC |
| 2 | choch_bearish_recent | Bearish CHOCH within 30 bars | binary | SMC |
| 3 | bos_bullish_recent | Bullish BOS within 30 bars | binary | SMC |
| 4 | bos_bearish_recent | Bearish BOS within 30 bars | binary | SMC |
| 5 | inducement_bullish | Bullish inducement detected | binary | SMC |
| 6 | inducement_bearish | Bearish inducement detected | binary | SMC |
| 7 | displacement_bullish | Recent bullish displacement | binary | SMC |
| 8 | displacement_bearish | Recent bearish displacement | binary | SMC |
| 9 | liquidity_void_distance | Distance to nearest void | float | SMC |
| 10 | structure_bullish_score | Bullish structure score | float | SMC |
| 11 | structure_bearish_score | Bearish structure score | float | SMC |
| 12 | smc_confluence_score | SMC confluence score | float | SMC |
```python
def extract_smc_features(df, lookback=30):
    """Extract SMC features."""
    f = {}

    # CHOCH
    choch_signals = detect_choch(df, lookback)
    f['choch_bullish_recent'] = has_recent_signal(choch_signals, 'bullish', 30)
    f['choch_bearish_recent'] = has_recent_signal(choch_signals, 'bearish', 30)

    # BOS
    bos_signals = detect_bos(df, lookback)
    f['bos_bullish_recent'] = has_recent_signal(bos_signals, 'bullish', 30)
    f['bos_bearish_recent'] = has_recent_signal(bos_signals, 'bearish', 30)

    # Inducement
    inducements = detect_inducement(df)
    f['inducement_bullish'] = has_recent_signal(inducements, 'bullish', 20)
    f['inducement_bearish'] = has_recent_signal(inducements, 'bearish', 20)

    # Displacement
    displacements = detect_displacement(df)
    f['displacement_bullish'] = has_recent_signal(displacements, 'bullish', 10)
    f['displacement_bearish'] = has_recent_signal(displacements, 'bearish', 10)

    # Liquidity voids
    voids = detect_liquidity_voids(df)
    f['liquidity_void_distance'] = calculate_nearest_void_distance(df['close'], voids)

    # Structure scores
    f['structure_bullish_score'] = calculate_bullish_structure_score(df)
    f['structure_bearish_score'] = calculate_bearish_structure_score(df)

    # SMC Confluence
    f['smc_confluence_score'] = calculate_smc_confluence(f)

    return pd.DataFrame(f)
```

### 10. Time Features (6)

| # | Feature | Formula | Type | Model(s) |
|---|---|---|---|---|
| 1 | hour_sin | sin(2 * pi * hour / 24) | float | All |
| 2 | hour_cos | cos(2 * pi * hour / 24) | float | All |
| 3 | day_of_week | Day of week (0-6) | int | All |
| 4 | is_weekend | Saturday or Sunday | binary | All |
| 5 | time_in_session | Minutes since session open | int | ICT |
| 6 | minutes_to_close | Minutes until session close | int | ICT |
```python
def extract_time_features(timestamps):
    """Extract time-of-day features."""
    f = {}

    hours = timestamps.hour
    f['hour_sin'] = np.sin(2 * np.pi * hours / 24)
    f['hour_cos'] = np.cos(2 * np.pi * hours / 24)
    f['day_of_week'] = timestamps.dayofweek
    f['is_weekend'] = (timestamps.dayofweek >= 5).astype(int)

    # Session timing
    f['time_in_session'] = calculate_time_in_session(timestamps)
    f['minutes_to_close'] = calculate_minutes_to_close(timestamps)

    return pd.DataFrame(f)
```
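The sin/cos pair exists so that 23:00 and midnight end up close together in feature space, which a raw hour column cannot express:

```python
import numpy as np

def encode_hour(h):
    # Map the hour onto the unit circle
    return np.sin(2 * np.pi * h / 24), np.cos(2 * np.pi * h / 24)

# Hours 23 and 0 are adjacent on the clock face even though
# their raw difference is 23.
a, b = np.array(encode_hour(23)), np.array(encode_hour(0))
cyclic_dist = np.linalg.norm(a - b)  # chord length for a 15-degree arc
```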

## Targets by Model

### 1. AMDDetector Target

**Type:** Multiclass classification (4 classes)

| Class | Value | Description |
|---|---|---|
| Neutral | 0 | No clearly defined phase |
| Accumulation | 1 | Accumulation phase |
| Manipulation | 2 | Manipulation phase |
| Distribution | 3 | Distribution phase |

**Labeling Method:**

```python
def label_amd_phase(df, i, forward_window=20):
    """
    Label the AMD phase from forward-looking behavior.

    Criteria:
    - Accumulation: narrow range, then price rises afterwards
    - Manipulation: false breakouts and whipsaws
    - Distribution: heavy volume on declines, then price falls afterwards
    - Neutral: none of the above clearly applies
    """
    if i + forward_window >= len(df):
        return 0  # neutral

    future = df.iloc[i:i+forward_window]
    current_price = df['close'].iloc[i]

    # Forward-window metrics
    price_range_pct = (future['high'].max() - future['low'].min()) / current_price
    final_price = future['close'].iloc[-1]
    price_change = (final_price - current_price) / current_price

    # Volume in the first vs. second half of the 20-bar window
    volume_first_half = future['volume'].iloc[:10].mean()
    volume_second_half = future['volume'].iloc[10:].mean()

    # False breakouts
    false_breaks = count_false_breakouts_forward(df, i, forward_window)

    # ACCUMULATION criteria
    if price_range_pct < 0.02:  # range < 2%
        if price_change > 0.01:  # rises > 1% afterwards
            if volume_second_half < volume_first_half:  # declining volume
                return 1  # accumulation

    # MANIPULATION criteria
    if false_breaks >= 2:  # 2+ false breakouts
        whipsaw_count = count_whipsaws_forward(df, i, forward_window)
        if whipsaw_count >= 3:
            return 2  # manipulation

    # DISTRIBUTION criteria
    if price_change < -0.015:  # drops > 1.5%
        # Heavy volume on down moves
        down_volume = calculate_volume_on_down_moves(future)
        if down_volume > 0.6:  # 60%+ of volume on declines
            return 3  # distribution

    return 0  # neutral


def count_false_breakouts_forward(df, i, window):
    """Count false breakouts within the forward window."""
    future = df.iloc[i:i+window]
    resistance = df['high'].iloc[max(0, i-20):i].max()
    support = df['low'].iloc[max(0, i-20):i].min()

    false_breaks = 0
    for j in range(1, len(future)):
        # False breakout above
        if future['high'].iloc[j] > resistance * 1.005:
            if future['close'].iloc[j] < resistance:
                false_breaks += 1
        # False breakdown below
        if future['low'].iloc[j] < support * 0.995:
            if future['close'].iloc[j] > support:
                false_breaks += 1

    return false_breaks
```

**Expected Class Balance:**

- Neutral: ~40%
- Accumulation: ~20%
- Manipulation: ~20%
- Distribution: ~20%
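If the realized distribution matches this expected balance, inverse-frequency class weights (a standard remedy for the neutral-heavy skew, shown here as an illustration rather than the module's configured weighting) would be:

```python
# Expected class priors from the labeling method above
expected = {'neutral': 0.40, 'accumulation': 0.20,
            'manipulation': 0.20, 'distribution': 0.20}
n_classes = len(expected)

# Inverse-frequency weights: weight_c = 1 / (n_classes * p_c)
weights = {cls: 1.0 / (n_classes * p) for cls, p in expected.items()}

# Weighted priors become uniform, so each class contributes
# equally to a weighted loss.
```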

### 2. RangePredictor Target

**Type:** Regression (continuous) + binned classification

**Regression targets:**

| Target | Formula | Horizon |
|---|---|---|
| delta_high_15m | (max_high_3bars - close) / close | 15 min |
| delta_low_15m | (close - min_low_3bars) / close | 15 min |
| delta_high_1h | (max_high_12bars - close) / close | 1 hour |
| delta_low_1h | (close - min_low_12bars) / close | 1 hour |

**Binned targets:**

| Bin | Range (ATR multiples) | Description |
|---|---|---|
| 0 | < 0.3 ATR | Very low |
| 1 | 0.3 - 0.7 ATR | Low |
| 2 | 0.7 - 1.2 ATR | Medium |
| 3 | > 1.2 ATR | High |
```python
def calculate_range_targets(df, horizons={'15m': 3, '1h': 12}):
    """Compute regression and binned targets for RangePredictor."""
    targets = {}
    atr = calculate_atr(df, 14)

    for name, periods in horizons.items():
        # Regression targets
        future_high = df['high'].rolling(periods).max().shift(-periods)
        future_low = df['low'].rolling(periods).min().shift(-periods)

        targets[f'delta_high_{name}'] = (future_high - df['close']) / df['close']
        targets[f'delta_low_{name}'] = (df['close'] - future_low) / df['close']

        # Binned targets
        for target_type in ['high', 'low']:
            delta = targets[f'delta_{target_type}_{name}']
            atr_ratio = delta / (atr / df['close'])

            bins = pd.cut(
                atr_ratio,
                bins=[-np.inf, 0.3, 0.7, 1.2, np.inf],
                labels=[0, 1, 2, 3]
            )
            targets[f'bin_{target_type}_{name}'] = bins

    return pd.DataFrame(targets)
```
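Note that `pd.cut` is right-inclusive by default, so a move of exactly 0.3 ATR falls into bin 0. A minimal check of the bin edges:

```python
import numpy as np
import pandas as pd

# ATR-normalized deltas, one per bin
atr_ratio = pd.Series([0.1, 0.5, 1.0, 2.0])
bins = pd.cut(atr_ratio, bins=[-np.inf, 0.3, 0.7, 1.2, np.inf], labels=[0, 1, 2, 3])
```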

### 3. TPSLClassifier Target

**Type:** Binary classification

| Value | Description |
|---|---|
| 0 | Stop Loss is hit first |
| 1 | Take Profit is hit first |

**R:R configurations:**

| Config | SL Distance | TP Distance | R:R |
|---|---|---|---|
| rr_2_1 | 0.3 ATR | 0.6 ATR | 2:1 |
| rr_3_1 | 0.3 ATR | 0.9 ATR | 3:1 |
| rr_4_1 | 0.25 ATR | 1.0 ATR | 4:1 |
```python
def calculate_tpsl_targets(df, horizons, rr_configs):
    """
    Compute targets for TPSLClassifier.

    Returns 1 if TP is hit first, 0 if SL is hit first, NaN if neither.
    """
    targets = {}
    atr = calculate_atr(df, 14)

    for horizon_name, max_bars in horizons.items():
        for rr in rr_configs:
            target_name = f'tp_first_{horizon_name}_{rr["name"]}'

            sl_distance = atr * rr['sl_atr_multiple']
            tp_distance = atr * rr['tp_atr_multiple']

            results = []
            for i in range(len(df)):
                if i + max_bars >= len(df):
                    results.append(np.nan)
                    continue

                entry_price = df['close'].iloc[i]
                sl_price = entry_price - sl_distance.iloc[i]
                tp_price = entry_price + tp_distance.iloc[i]

                # Simulate forward
                result = simulate_trade_outcome(
                    df.iloc[i+1:i+max_bars+1],
                    entry_price,
                    sl_price,
                    tp_price
                )
                results.append(result)

            targets[target_name] = results

    return pd.DataFrame(targets)


def simulate_trade_outcome(future_bars, entry, sl, tp):
    """
    Simulate the trade outcome.
    Returns: 1 (TP first), 0 (SL first), NaN (neither).
    """
    for _, row in future_bars.iterrows():
        # Check SL first (conservative worst-case assumption when a bar spans both)
        if row['low'] <= sl:
            return 0
        # Check TP
        if row['high'] >= tp:
            return 1

    return np.nan  # Neither hit within the window
```
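The SL-first ordering matters when a single bar spans both levels: with no intrabar data, the simulator conservatively scores such bars as losses. Exercising the function directly:

```python
import numpy as np
import pandas as pd

def simulate_trade_outcome(future_bars, entry, sl, tp):
    # Same logic as above: SL is checked before TP on every bar.
    for _, row in future_bars.iterrows():
        if row['low'] <= sl:
            return 0
        if row['high'] >= tp:
            return 1
    return np.nan

# entry 100, SL 99, TP 102
win = pd.DataFrame({'high': [101.0, 102.5], 'low': [99.5, 100.0]})   # TP on bar 2
both = pd.DataFrame({'high': [103.0], 'low': [98.0]})                # one bar hits both
```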

### 4. LiquidityHunter Target

**Type:** Binary classification

| Value | Description |
|---|---|
| 0 | No liquidity sweep |
| 1 | Liquidity sweep occurs |

**Sweep types:**

| Target | Description |
|---|---|
| bsl_sweep | Buy Side Liquidity sweep |
| ssl_sweep | Sell Side Liquidity sweep |
| any_sweep | Any sweep |
```python
def calculate_liquidity_targets(df, forward_window=10, sweep_threshold=0.005):
    """Compute targets for LiquidityHunter."""
    targets = {}

    for i in range(len(df) - forward_window):
        # Current liquidity levels (20-bar lookback)
        lookback = df.iloc[max(0, i-20):i]
        swing_high = lookback['high'].max()
        swing_low = lookback['low'].min()

        # Future price action
        future = df.iloc[i:i+forward_window]

        # BSL sweep (price trades above the swing high, then reverses)
        bsl_level = swing_high * (1 + sweep_threshold)
        bsl_swept = (future['high'] >= bsl_level).any()
        bsl_reversed = bsl_swept and (future['close'].iloc[-1] < swing_high)

        # SSL sweep (price trades below the swing low, then reverses)
        ssl_level = swing_low * (1 - sweep_threshold)
        ssl_swept = (future['low'] <= ssl_level).any()
        ssl_reversed = ssl_swept and (future['close'].iloc[-1] > swing_low)

        targets.setdefault('bsl_sweep', []).append(1 if bsl_reversed else 0)
        targets.setdefault('ssl_sweep', []).append(1 if ssl_reversed else 0)
        targets.setdefault('any_sweep', []).append(1 if (bsl_reversed or ssl_reversed) else 0)

    # Pad the last indices (no full forward window available)
    for key in targets:
        targets[key].extend([np.nan] * forward_window)

    return pd.DataFrame(targets)
```

### 5. ICTContextModel Target

**Type:** Continuous score (0-1)

This model has no conventional supervised target; instead it computes a real-time score from the current ICT context.

```python
def calculate_ict_context_score(df, timestamps):
    """
    Compute the ICT context score (0-1).

    Factors:
    - Killzone strength (40%)
    - OTE position alignment (30%)
    - Range position (20%)
    - MM model detection (10%)
    """
    score = 0.0

    # Killzone
    killzone = identify_killzone(timestamps.iloc[-1])
    kz_strength = get_killzone_strength(killzone)
    score += 0.40 * kz_strength

    # OTE alignment
    ote_pos = calculate_ote_position(df)
    if ote_pos < 0.38:  # discount
        ote_alignment = 0.38 - ote_pos  # better the lower it is
    elif ote_pos > 0.62:  # premium
        ote_alignment = ote_pos - 0.62  # better the higher it is
    else:
        ote_alignment = 0  # near equilibrium
    score += 0.30 * min(ote_alignment * 3, 1.0)

    # Range position
    daily_pos = calculate_daily_range_position(df)
    range_score = abs(daily_pos - 0.5) * 2  # better at the extremes
    score += 0.20 * range_score

    # MM model
    mm_model = detect_market_maker_model(df)
    if mm_model['model'] != 'none':
        score += 0.10 * mm_model['confidence']

    return score
```

## Feature Engineering Pipeline

### Full Pipeline

```python
from sklearn.preprocessing import RobustScaler


class FeatureEngineeringPipeline:
    """End-to-end feature engineering pipeline."""

    def __init__(self, config=None):
        self.config = config or self._default_config()
        self.scaler = RobustScaler()
        self.feature_names = []

    def fit_transform(self, df, timestamps=None):
        """Extract and normalize all features."""

        # 1. Extract all feature groups
        price_features = extract_price_action_features(df)
        volume_features = extract_volume_features(df)
        volatility_features = extract_volatility_features(df)
        trend_features = extract_trend_features(df)
        structure_features = extract_market_structure_features(df)
        order_flow_features = extract_order_flow_features(df)
        liquidity_features = extract_liquidity_features(df)

        if timestamps is not None:
            ict_features = extract_ict_features(df, timestamps)
            time_features = extract_time_features(timestamps)
        else:
            ict_features = pd.DataFrame()
            time_features = pd.DataFrame()

        smc_features = extract_smc_features(df)

        # 2. Combine all features
        all_features = pd.concat([
            price_features,
            volume_features,
            volatility_features,
            trend_features,
            structure_features,
            order_flow_features,
            liquidity_features,
            ict_features,
            smc_features,
            time_features
        ], axis=1)

        # 3. Handle NaN (fillna(method='ffill') is deprecated in pandas 2.x)
        all_features = all_features.ffill().fillna(0)

        # 4. Store feature names
        self.feature_names = all_features.columns.tolist()

        # 5. Scale features
        scaled_features = self.scaler.fit_transform(all_features)

        return scaled_features

    def transform(self, df, timestamps=None):
        """Transform only (reuses the already-fitted scaler)."""
        # ... same extraction as fit_transform ...
        return self.scaler.transform(all_features)

    def get_feature_importance(self, model, top_n=20):
        """Return the top-n feature importances from a fitted model."""
        importance = pd.DataFrame({
            'feature': self.feature_names,
            'importance': model.feature_importances_
        }).sort_values('importance', ascending=False)

        return importance.head(top_n)
```
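The NaN-handling and scaling steps can be exercised in isolation on toy feature values. `RobustScaler` centers on the median and scales by the IQR, which keeps outlier bars from dominating:

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import RobustScaler

# Toy feature matrix: NaNs mimic rolling-window warm-up and data gaps
feats = pd.DataFrame({'a': [np.nan, 1.0, 2.0, np.nan],
                      'b': [0.5, np.nan, 0.7, 0.9]})
feats = feats.ffill().fillna(0)  # forward-fill gaps; leading NaNs become 0
scaled = RobustScaler().fit_transform(feats)
```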

## Validation and Testing

### Metrics per Model

| Model | Primary Metric | Target | Secondary Metrics |
|---|---|---|---|
| AMDDetector | Accuracy | >70% | F1 macro >0.65, per-class precision >60% |
| RangePredictor | MAE | <0.003 | R² >0.3, directional accuracy >90% |
| TPSLClassifier | AUC | >0.85 | Accuracy >80%, precision >75% |
| LiquidityHunter | Precision | >70% | Recall >60%, F1 >0.65 |
| ICTContextModel | - | - | Qualitative validation |

Validacion Temporal

def temporal_validation(model, X, y, n_splits=5):
    """
    Validacion respetando orden temporal
    """
    tscv = TimeSeriesSplit(n_splits=n_splits)
    scores = []

    for fold, (train_idx, val_idx) in enumerate(tscv.split(X)):
        X_train, X_val = X[train_idx], X[val_idx]
        y_train, y_val = y[train_idx], y[val_idx]

        model.fit(X_train, y_train)
        y_pred = model.predict(X_val)

        score = calculate_metrics(y_val, y_pred)
        scores.append(score)

    return np.mean(scores), np.std(scores)
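`TimeSeriesSplit` guarantees that every validation fold lies strictly after its training data, which is the property this function relies on to avoid leaking future bars into training:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(20).reshape(-1, 1)
tscv = TimeSeriesSplit(n_splits=4)
for train_idx, val_idx in tscv.split(X):
    # Each validation fold starts strictly after the training data ends
    assert train_idx.max() < val_idx.min()
```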

*Document generated: 2025-12-08. Trading Strategist - Trading Platform*