---
id: "INTEGRACION-API-MASSIVE"
title: "API Massive Integration - Data Pipeline"
type: "Documentation"
project: "trading-platform"
version: "1.0.0"
updated_date: "2026-01-04"
---
# API Massive Integration - Data Pipeline
**Version:** 1.0.0
**Date:** 2025-12-08
**Module:** Data Services
**Author:** Trading Strategist - Trading Platform
---
## Table of Contents
1. [Overview](#overview)
2. [API Massive Overview](#api-massive-overview)
3. [Data Architecture](#data-architecture)
4. [Ingestion Pipeline](#ingestion-pipeline)
5. [Gap Detection and Filling](#gap-detection-and-filling)
6. [MT4 Synchronization](#mt4-synchronization)
7. [API Endpoints](#api-endpoints)
8. [Scheduling](#scheduling)
9. [Implementation](#implementation)
---
## Overview
### Objective
Implement a robust data pipeline that:
1. **Downloads historical data** from API Massive
2. **Updates data** incrementally
3. **Detects and fills gaps** in the data
4. **Synchronizes** with MT4 broker prices
5. **Maintains data quality** for ML
### Supported Symbols
| Symbol | Description | Available History |
|--------|-------------|-------------------|
| XAUUSD | Gold vs USD | 10+ years |
| EURUSD | Euro vs USD | 10+ years |
| GBPUSD | Pound vs USD | 10+ years |
| USDJPY | USD vs Yen | 10+ years |
### Timeframes
| Timeframe | Code | Bars/Day |
|-----------|------|----------|
| 1 minute | M1 | 1,440 |
| 5 minutes | M5 | 288 |
| 15 minutes | M15 | 96 |
| 1 hour | H1 | 24 |
| 4 hours | H4 | 6 |
| Daily | D1 | 1 |
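The bars-per-day column follows directly from the 1,440 minutes in a day. A minimal sketch of that derivation (the helper names here are illustrative, not part of the codebase):

```python
# Minutes per bar for each supported timeframe.
TIMEFRAME_MINUTES = {"M1": 1, "M5": 5, "M15": 15, "H1": 60, "H4": 240, "D1": 1440}

def bars_per_day(timeframe: str) -> int:
    """Bars produced by a full 24h trading day for a given timeframe."""
    return 1440 // TIMEFRAME_MINUTES[timeframe]
```

Note that this assumes a full 24h session; on days with market closures the real bar count is lower, which is exactly what the gap detector later has to tolerate.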
---
## API Massive Overview
### Authentication
```python
# API Massive uses API key authentication
headers = {
    "Authorization": f"Bearer {API_MASSIVE_KEY}",
    "Content-Type": "application/json"
}
```
### Main Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/v1/symbols` | GET | Lists available symbols |
| `/api/v1/candles/{symbol}` | GET | Fetches OHLCV data |
| `/api/v1/candles/batch` | POST | Batch data download |
| `/api/v1/tick/{symbol}` | GET | Tick-by-tick data |
### Rate Limits
| Plan | Requests/min | Candles/request | Max Historical |
|------|--------------|-----------------|----------------|
| Free | 10 | 1,000 | 1 year |
| Basic | 60 | 5,000 | 5 years |
| Pro | 300 | 50,000 | Unlimited |
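These limits bound how fast a full history can be downloaded. A rough sketch of the arithmetic (helper names are illustrative, and it assumes continuous 24h trading with negligible request latency): 10 years of M5 data (288 bars/day) on the Basic plan takes 211 requests, about 3.5 minutes at 60 requests/min.

```python
import math

def full_sync_requests(years: int, bars_per_day: int, candles_per_request: int) -> int:
    """Number of API requests needed to download `years` of history."""
    total_candles = years * 365 * bars_per_day
    return math.ceil(total_candles / candles_per_request)

def full_sync_minutes(requests: int, requests_per_min: int) -> float:
    """Lower bound on wall-clock minutes, set purely by the plan's rate limit."""
    return requests / requests_per_min
```

On the Free plan (1,000 candles/request, 10 req/min) the same history would need over a thousand requests, which is why the full sync is scheduled as a background job.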
---
## Data Architecture
### Flow Diagram
```
DATA SOURCES
┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│ API Massive  │    │ MetaTrader4  │    │    Backup    │
│ (Historical) │    │    (Live)    │    │     (S3)     │
└──────┬───────┘    └──────┬───────┘    └──────┬───────┘
       └───────────────────┼───────────────────┘
                           ▼
DATA PROCESSOR
┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│   Fetcher    │    │  Validator   │    │  Normalizer  │
│ - Download   │───▶│ - Schema     │───▶│ - Timezone   │
│ - Retry      │    │ - Range      │    │ - Decimals   │
│ - Rate limit │    │ - Outliers   │    │ - Format     │
└──────────────┘    └──────────────┘    └──────┬───────┘
                           ┌───────────────────┘
                           ▼
GAP HANDLER
┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│   Detector   │───▶│    Filler    │───▶│    Merger    │
└──────────────┘    └──────────────┘    └──────┬───────┘
                           ┌───────────────────┘
                           ▼
STORAGE: PostgreSQL
  Tables:
    - market_data (partitioned by symbol, month)
    - data_gaps   (gap tracking)
    - sync_status (last sync per symbol)
  Indexes:
    - (symbol, timeframe, timestamp) - unique
    - (symbol, timestamp)            - for range queries
```
### Database Schema
```sql
-- Market data table (partitioned)
CREATE TABLE market_data (
    id BIGSERIAL,
    symbol VARCHAR(20) NOT NULL,
    timeframe VARCHAR(5) NOT NULL,
    timestamp TIMESTAMPTZ NOT NULL,
    open DECIMAL(18, 8) NOT NULL,
    high DECIMAL(18, 8) NOT NULL,
    low DECIMAL(18, 8) NOT NULL,
    close DECIMAL(18, 8) NOT NULL,
    volume DECIMAL(18, 8),
    source VARCHAR(20) DEFAULT 'api_massive',
    created_at TIMESTAMPTZ DEFAULT NOW(),
    PRIMARY KEY (symbol, timeframe, timestamp)
) PARTITION BY LIST (symbol);

-- Create partitions per symbol
CREATE TABLE market_data_xauusd PARTITION OF market_data FOR VALUES IN ('XAUUSD');
CREATE TABLE market_data_eurusd PARTITION OF market_data FOR VALUES IN ('EURUSD');
CREATE TABLE market_data_gbpusd PARTITION OF market_data FOR VALUES IN ('GBPUSD');
CREATE TABLE market_data_usdjpy PARTITION OF market_data FOR VALUES IN ('USDJPY');

-- Index for time range queries
CREATE INDEX idx_market_data_symbol_time ON market_data (symbol, timestamp DESC);

-- Sync status table
CREATE TABLE sync_status (
    id SERIAL PRIMARY KEY,
    symbol VARCHAR(20) NOT NULL,
    timeframe VARCHAR(5) NOT NULL,
    last_sync_time TIMESTAMPTZ,
    last_candle_time TIMESTAMPTZ,
    total_candles BIGINT DEFAULT 0,
    status VARCHAR(20) DEFAULT 'pending',
    updated_at TIMESTAMPTZ DEFAULT NOW(),
    UNIQUE(symbol, timeframe)
);

-- Data gaps tracking
CREATE TABLE data_gaps (
    id SERIAL PRIMARY KEY,
    symbol VARCHAR(20) NOT NULL,
    timeframe VARCHAR(5) NOT NULL,
    gap_start TIMESTAMPTZ NOT NULL,
    gap_end TIMESTAMPTZ NOT NULL,
    candles_missing INT,
    status VARCHAR(20) DEFAULT 'detected', -- detected, filling, filled, unfillable
    created_at TIMESTAMPTZ DEFAULT NOW(),
    filled_at TIMESTAMPTZ,
    UNIQUE(symbol, timeframe, gap_start)
);
```
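Adding a new symbol requires a matching LIST partition. A small sketch of generating that DDL programmatically (the helper is hypothetical, not part of the schema scripts):

```python
def partition_ddl(symbol: str) -> str:
    """DDL for a per-symbol LIST partition of market_data."""
    safe = symbol.lower()
    return (
        f"CREATE TABLE market_data_{safe} PARTITION OF market_data "
        f"FOR VALUES IN ('{symbol.upper()}');"
    )
```

Without a matching partition, inserts for a new symbol fail, so this would typically run as part of symbol onboarding before the first sync.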
---
## Ingestion Pipeline
### API Massive Client
```python
# services/api_massive_client.py
import asyncio
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List

import httpx


@dataclass
class Candle:
    timestamp: datetime
    open: float
    high: float
    low: float
    close: float
    volume: float


class APIMassiveClient:
    """Client for API Massive."""

    def __init__(self, config: Dict):
        self.base_url = config['base_url']
        self.api_key = config['api_key']
        self.rate_limit = config.get('rate_limit', 60)  # requests/min
        self.batch_size = config.get('batch_size', 5000)
        self._request_count = 0
        self._last_reset = datetime.now()

    async def get_candles(
        self,
        symbol: str,
        timeframe: str,
        start: datetime,
        end: datetime
    ) -> List[Candle]:
        """
        Fetches candles for a time range.

        Args:
            symbol: Trading pair (XAUUSD, etc.)
            timeframe: Timeframe (M5, H1, etc.)
            start: Start date
            end: End date

        Returns:
            List of Candle objects
        """
        await self._check_rate_limit()
        async with httpx.AsyncClient() as client:
            response = await client.get(
                f"{self.base_url}/api/v1/candles/{symbol}",
                headers={"Authorization": f"Bearer {self.api_key}"},
                params={
                    "timeframe": timeframe,
                    "start": start.isoformat(),
                    "end": end.isoformat(),
                    "limit": self.batch_size
                },
                timeout=60.0
            )
        if response.status_code != 200:
            raise Exception(f"API Error: {response.status_code} - {response.text}")
        data = response.json()
        return [
            Candle(
                timestamp=datetime.fromisoformat(c['timestamp']),
                open=float(c['open']),
                high=float(c['high']),
                low=float(c['low']),
                close=float(c['close']),
                volume=float(c.get('volume', 0))
            )
            for c in data['candles']
        ]

    async def get_candles_batch(
        self,
        symbol: str,
        timeframe: str,
        start: datetime,
        end: datetime
    ) -> List[Candle]:
        """Downloads candles in batches for long ranges."""
        all_candles = []
        current_start = start
        while current_start < end:
            # Calculate batch end
            candles_per_day = self._get_candles_per_day(timeframe)
            days_per_batch = max(1, self.batch_size // candles_per_day)
            batch_end = min(current_start + timedelta(days=days_per_batch), end)
            # Fetch batch
            candles = await self.get_candles(symbol, timeframe, current_start, batch_end)
            all_candles.extend(candles)
            # Move to next batch
            if candles:
                current_start = candles[-1].timestamp + self._get_timeframe_delta(timeframe)
            else:
                current_start = batch_end
            # Progress logging
            print(f"  Downloaded {len(all_candles)} candles up to {current_start}")
        return all_candles

    async def _check_rate_limit(self):
        """Waits if the per-minute rate limit has been reached."""
        now = datetime.now()
        # Reset the counter every minute
        if (now - self._last_reset).total_seconds() >= 60:
            self._request_count = 0
            self._last_reset = now
        # Check the limit
        if self._request_count >= self.rate_limit:
            sleep_time = 60 - (now - self._last_reset).total_seconds()
            print(f"Rate limit reached. Sleeping {sleep_time:.0f}s...")
            await asyncio.sleep(sleep_time)
            self._request_count = 0
            self._last_reset = datetime.now()
        self._request_count += 1

    def _get_candles_per_day(self, timeframe: str) -> int:
        """Candles per day for a timeframe."""
        mapping = {
            'M1': 1440, 'M5': 288, 'M15': 96, 'M30': 48,
            'H1': 24, 'H4': 6, 'D1': 1
        }
        return mapping.get(timeframe, 288)

    def _get_timeframe_delta(self, timeframe: str) -> timedelta:
        """Time delta for a timeframe."""
        mapping = {
            'M1': timedelta(minutes=1),
            'M5': timedelta(minutes=5),
            'M15': timedelta(minutes=15),
            'M30': timedelta(minutes=30),
            'H1': timedelta(hours=1),
            'H4': timedelta(hours=4),
            'D1': timedelta(days=1)
        }
        return mapping.get(timeframe, timedelta(minutes=5))
```
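The batch-splitting arithmetic inside `get_candles_batch` can be isolated as a pure function, which makes the window boundaries easy to verify without network access. This is a standalone sketch, not part of the client:

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple

def batch_windows(
    start: datetime,
    end: datetime,
    candles_per_day: int,
    batch_size: int,
) -> Iterator[Tuple[datetime, datetime]]:
    """Splits [start, end) into windows small enough to fit one API request."""
    days_per_batch = max(1, batch_size // candles_per_day)
    current = start
    while current < end:
        batch_end = min(current + timedelta(days=days_per_batch), end)
        yield current, batch_end
        current = batch_end
```

For example, 30 days of M5 data (288 bars/day) with a 5,000-candle batch size splits into 17-day windows, so two requests cover the range.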
### Data Fetcher Service
```python
# services/data_fetcher.py
from datetime import datetime, timedelta
from typing import Dict, List, Optional

import asyncpg

from services.api_massive_client import APIMassiveClient, Candle


class DataFetcher:
    """Service that downloads and stores market data."""

    def __init__(self, api_client: APIMassiveClient, db_pool: asyncpg.Pool, config: Dict):
        self.api = api_client
        self.db = db_pool
        self.config = config

    async def full_sync(
        self,
        symbol: str,
        timeframe: str = 'M5',
        years: int = 10
    ) -> Dict:
        """
        Full synchronization of historical data.

        Args:
            symbol: Trading pair
            timeframe: Timeframe to download
            years: Years of history

        Returns:
            Sync summary
        """
        print(f"\n=== Full Sync: {symbol} {timeframe} ===")
        end = datetime.utcnow()
        start = end - timedelta(days=years * 365)
        # Download all candles
        candles = await self.api.get_candles_batch(symbol, timeframe, start, end)
        print(f"Downloaded {len(candles)} candles")
        # Validate and clean
        valid_candles = self._validate_candles(candles)
        print(f"Valid candles: {len(valid_candles)}")
        # Store in database
        inserted = await self._store_candles(symbol, timeframe, valid_candles)
        print(f"Inserted {inserted} candles")
        # Update sync status
        await self._update_sync_status(symbol, timeframe, len(valid_candles))
        return {
            'symbol': symbol,
            'timeframe': timeframe,
            'downloaded': len(candles),
            'valid': len(valid_candles),
            'inserted': inserted,
            'start': start.isoformat(),
            'end': end.isoformat()
        }

    async def incremental_sync(
        self,
        symbol: str,
        timeframe: str = 'M5'
    ) -> Dict:
        """Incremental sync starting from the last stored candle."""
        # Get last candle time
        last_time = await self._get_last_candle_time(symbol, timeframe)
        if not last_time:
            # No data, do full sync
            return await self.full_sync(symbol, timeframe)
        # Download from last candle to now
        end = datetime.utcnow()
        start = last_time + self.api._get_timeframe_delta(timeframe)
        if start >= end:
            return {'symbol': symbol, 'message': 'Already up to date'}
        candles = await self.api.get_candles_batch(symbol, timeframe, start, end)
        if candles:
            valid_candles = self._validate_candles(candles)
            inserted = await self._store_candles(symbol, timeframe, valid_candles)
            await self._update_sync_status(symbol, timeframe, inserted)
            return {
                'symbol': symbol,
                'timeframe': timeframe,
                'new_candles': inserted,
                'latest': candles[-1].timestamp.isoformat()
            }
        return {'symbol': symbol, 'message': 'No new data'}

    def _validate_candles(self, candles: List[Candle]) -> List[Candle]:
        """Validates and filters candles."""
        valid = []
        for c in candles:
            # Basic OHLC consistency checks
            if c.high < c.low:
                continue
            if c.open <= 0 or c.close <= 0:
                continue
            if c.high < max(c.open, c.close):
                continue
            if c.low > min(c.open, c.close):
                continue
            valid.append(c)
        return valid

    async def _store_candles(
        self,
        symbol: str,
        timeframe: str,
        candles: List[Candle]
    ) -> int:
        """Stores candles in PostgreSQL."""
        if not candles:
            return 0
        # Prepare data for bulk insert
        records = [
            (symbol, timeframe, c.timestamp, c.open, c.high, c.low, c.close, c.volume, 'api_massive')
            for c in candles
        ]
        # Bulk upsert
        async with self.db.acquire() as conn:
            await conn.executemany('''
                INSERT INTO market_data (symbol, timeframe, timestamp, open, high, low, close, volume, source)
                VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9)
                ON CONFLICT (symbol, timeframe, timestamp)
                DO UPDATE SET open = EXCLUDED.open, high = EXCLUDED.high,
                              low = EXCLUDED.low, close = EXCLUDED.close,
                              volume = EXCLUDED.volume
            ''', records)
        return len(records)

    async def _get_last_candle_time(self, symbol: str, timeframe: str) -> Optional[datetime]:
        """Returns the timestamp of the most recent stored candle."""
        async with self.db.acquire() as conn:
            return await conn.fetchval('''
                SELECT MAX(timestamp) FROM market_data
                WHERE symbol = $1 AND timeframe = $2
            ''', symbol, timeframe)

    async def _update_sync_status(self, symbol: str, timeframe: str, candles_added: int):
        """Updates the sync status row for a symbol/timeframe."""
        async with self.db.acquire() as conn:
            await conn.execute('''
                INSERT INTO sync_status (symbol, timeframe, last_sync_time, total_candles, status)
                VALUES ($1, $2, NOW(), $3, 'synced')
                ON CONFLICT (symbol, timeframe)
                DO UPDATE SET last_sync_time = NOW(),
                              total_candles = sync_status.total_candles + $3,
                              status = 'synced',
                              updated_at = NOW()
            ''', symbol, timeframe, candles_added)
```
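The OHLC rules enforced by `_validate_candles` can be shown in isolation. The `Candle` below is a simplified stand-in carrying only the four price fields:

```python
from dataclasses import dataclass

@dataclass
class Candle:
    open: float
    high: float
    low: float
    close: float

def is_valid(c: Candle) -> bool:
    """OHLC consistency checks mirroring DataFetcher._validate_candles."""
    if c.high < c.low:
        return False
    if c.open <= 0 or c.close <= 0:
        return False
    if c.high < max(c.open, c.close):  # high must cap the candle body
        return False
    if c.low > min(c.open, c.close):   # low must floor the candle body
        return False
    return True
```

Any candle failing one of these checks is silently dropped rather than repaired, which keeps downstream ML features free of impossible bars.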
---
## Gap Detection and Filling
### Gap Detector
```python
# services/gap_handler.py
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional

from services.api_massive_client import APIMassiveClient, Candle


@dataclass
class DataGap:
    start: datetime
    end: datetime
    candles_missing: int
    duration_minutes: int


class GapHandler:
    """Detects and fills gaps in the data."""

    def __init__(self, db_pool, api_client: APIMassiveClient, config: Dict):
        self.db = db_pool
        self.api = api_client
        self.config = config

    async def detect_gaps(
        self,
        symbol: str,
        timeframe: str,
        start: Optional[datetime] = None,
        end: Optional[datetime] = None
    ) -> List[DataGap]:
        """
        Detects gaps in the data.

        A gap is defined as more than N consecutive missing candles,
        where N = config['min_gap_candles'] (default 3).
        """
        min_gap_candles = self.config.get('min_gap_candles', 3)
        timeframe_delta = self._get_timeframe_delta(timeframe)
        # Get all timestamps
        async with self.db.acquire() as conn:
            query = '''
                SELECT timestamp FROM market_data
                WHERE symbol = $1 AND timeframe = $2
            '''
            params = [symbol, timeframe]
            if start:
                query += ' AND timestamp >= $3'
                params.append(start)
            if end:
                query += f' AND timestamp <= ${len(params) + 1}'
                params.append(end)
            query += ' ORDER BY timestamp'
            rows = await conn.fetch(query, *params)
        if len(rows) < 2:
            return []
        # Detect gaps
        gaps = []
        timestamps = [row['timestamp'] for row in rows]
        for i in range(1, len(timestamps)):
            actual_delta = timestamps[i] - timestamps[i - 1]
            if actual_delta > timeframe_delta * min_gap_candles:
                candles_missing = int(actual_delta / timeframe_delta) - 1
                gaps.append(DataGap(
                    start=timestamps[i - 1] + timeframe_delta,
                    end=timestamps[i] - timeframe_delta,
                    candles_missing=candles_missing,
                    duration_minutes=int(actual_delta.total_seconds() / 60)
                ))
        # Log gaps to database
        for gap in gaps:
            await self._log_gap(symbol, timeframe, gap)
        return gaps

    async def fill_gaps(
        self,
        symbol: str,
        timeframe: str,
        max_gaps: int = 100
    ) -> Dict:
        """Attempts to fill detected gaps."""
        # Get pending gaps
        async with self.db.acquire() as conn:
            rows = await conn.fetch('''
                SELECT * FROM data_gaps
                WHERE symbol = $1 AND timeframe = $2 AND status = 'detected'
                ORDER BY gap_start
                LIMIT $3
            ''', symbol, timeframe, max_gaps)
        if not rows:
            return {'message': 'No gaps to fill', 'filled': 0}
        filled_count = 0
        unfillable_count = 0
        for row in rows:
            gap_start = row['gap_start']
            gap_end = row['gap_end']
            try:
                # Try to fetch the missing data
                candles = await self.api.get_candles(symbol, timeframe, gap_start, gap_end)
                if candles:
                    # Store candles
                    await self._store_gap_candles(symbol, timeframe, candles)
                    # Mark gap as filled
                    await self._update_gap_status(row['id'], 'filled', len(candles))
                    filled_count += 1
                else:
                    # No data available (likely the market was closed)
                    await self._update_gap_status(row['id'], 'unfillable', 0)
                    unfillable_count += 1
            except Exception as e:
                print(f"Error filling gap {row['id']}: {e}")
                continue
        return {
            'gaps_processed': len(rows),
            'filled': filled_count,
            'unfillable': unfillable_count
        }

    async def _log_gap(self, symbol: str, timeframe: str, gap: DataGap):
        """Records a gap in the database."""
        async with self.db.acquire() as conn:
            await conn.execute('''
                INSERT INTO data_gaps (symbol, timeframe, gap_start, gap_end, candles_missing, status)
                VALUES ($1, $2, $3, $4, $5, 'detected')
                ON CONFLICT (symbol, timeframe, gap_start)
                DO UPDATE SET gap_end = EXCLUDED.gap_end,
                              candles_missing = EXCLUDED.candles_missing
            ''', symbol, timeframe, gap.start, gap.end, gap.candles_missing)

    async def _update_gap_status(self, gap_id: int, status: str, candles_filled: int):
        """Updates the status of a gap."""
        async with self.db.acquire() as conn:
            await conn.execute('''
                UPDATE data_gaps
                SET status = $2, filled_at = NOW()
                WHERE id = $1
            ''', gap_id, status)

    async def _store_gap_candles(self, symbol: str, timeframe: str, candles: List[Candle]):
        """Stores gap-fill candles."""
        records = [
            (symbol, timeframe, c.timestamp, c.open, c.high, c.low, c.close, c.volume, 'gap_fill')
            for c in candles
        ]
        async with self.db.acquire() as conn:
            await conn.executemany('''
                INSERT INTO market_data (symbol, timeframe, timestamp, open, high, low, close, volume, source)
                VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9)
                ON CONFLICT (symbol, timeframe, timestamp) DO NOTHING
            ''', records)

    def _get_timeframe_delta(self, timeframe: str) -> timedelta:
        mapping = {
            'M1': timedelta(minutes=1),
            'M5': timedelta(minutes=5),
            'M15': timedelta(minutes=15),
            'M30': timedelta(minutes=30),
            'H1': timedelta(hours=1),
            'H4': timedelta(hours=4),
            'D1': timedelta(days=1)
        }
        return mapping.get(timeframe, timedelta(minutes=5))
```
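The core of `detect_gaps` is a pure scan over sorted timestamps; a database-free sketch of the same rule makes the boundary arithmetic easy to check:

```python
from datetime import datetime, timedelta
from typing import List, Tuple

def find_gaps(
    timestamps: List[datetime],
    delta: timedelta,
    min_gap_candles: int = 3,
) -> List[Tuple[datetime, datetime, int]]:
    """Returns (gap_start, gap_end, candles_missing) for each detected gap."""
    gaps = []
    for prev, curr in zip(timestamps, timestamps[1:]):
        actual = curr - prev
        if actual > delta * min_gap_candles:
            missing = int(actual / delta) - 1
            gaps.append((prev + delta, curr - delta, missing))
    return gaps
```

A 30-minute hole in an M5 series (more than 3 missing candles) is reported as five missing bars, with the gap bounds one candle inside the surrounding data on each side.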
---
## MT4 Synchronization
### Price Comparison Service
```python
# services/price_sync.py
import asyncio
from datetime import datetime, timedelta
from typing import Dict

import numpy as np


class PriceSync:
    """Synchronizes prices between API Massive and MT4 brokers."""

    def __init__(self, db_pool, metaapi_client, config: Dict):
        self.db = db_pool
        self.mt4 = metaapi_client
        self.config = config

    async def compare_prices(
        self,
        symbol: str,
        account_id: str,
        window_minutes: int = 60
    ) -> Dict:
        """
        Compares API Massive prices against MT4.

        Returns:
            {
                'avg_difference': float,
                'max_difference': float,
                'std_difference': float,
                'samples': int
            }
        """
        end = datetime.utcnow()
        start = end - timedelta(minutes=window_minutes)
        # Get API Massive data
        async with self.db.acquire() as conn:
            api_data = await conn.fetch('''
                SELECT timestamp, close FROM market_data
                WHERE symbol = $1 AND timeframe = 'M5'
                AND timestamp BETWEEN $2 AND $3
                ORDER BY timestamp
            ''', symbol, start, end)
        if not api_data:
            return {'error': 'No API data available'}
        # Get MT4 current price
        mt4_price = await self.mt4.get_symbol_price(account_id, symbol)
        # Calculate differences (close comes back as DECIMAL, so cast to float)
        differences = [
            abs(float(row['close']) - mt4_price['bid'])
            for row in api_data
        ]
        return {
            'avg_difference': np.mean(differences),
            'max_difference': np.max(differences),
            'std_difference': np.std(differences),
            'samples': len(differences),
            'api_latest': float(api_data[-1]['close']),
            'mt4_bid': mt4_price['bid'],
            'mt4_ask': mt4_price['ask']
        }

    async def calculate_broker_offset(
        self,
        symbol: str,
        account_id: str,
        samples: int = 100
    ) -> Dict:
        """
        Computes the average offset between the API and the broker.
        Useful for adjusting prices at execution time.
        """
        offsets = []
        for _ in range(samples):
            # Get current prices
            mt4_price = await self.mt4.get_symbol_price(account_id, symbol)
            # Get latest API candle
            async with self.db.acquire() as conn:
                api_close = await conn.fetchval('''
                    SELECT close FROM market_data
                    WHERE symbol = $1 AND timeframe = 'M5'
                    ORDER BY timestamp DESC LIMIT 1
                ''', symbol)
            if api_close:
                mid_price = (mt4_price['bid'] + mt4_price['ask']) / 2
                offsets.append(mid_price - float(api_close))
            await asyncio.sleep(1)  # 1 sample per second
        return {
            'symbol': symbol,
            'broker': account_id,
            'avg_offset': np.mean(offsets),
            'std_offset': np.std(offsets),
            'samples': len(offsets),
            'recommendation': {
                'use_offset': abs(np.mean(offsets)) > 0.5,  # > 0.5 pips
                'offset_value': np.mean(offsets)
            }
        }
```
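The offset recommendation reduces to simple statistics over sampled price differences. A dependency-free sketch using the standard library's `statistics` module in place of numpy (the 0.5 threshold mirrors the one used in `calculate_broker_offset`; the helper name is illustrative):

```python
import statistics
from typing import Dict, List

# Assumed threshold: offsets larger than 0.5 price units trigger an adjustment.
OFFSET_THRESHOLD = 0.5

def offset_summary(offsets: List[float]) -> Dict:
    """Summarizes broker-vs-API offsets and recommends whether to apply one."""
    avg = statistics.mean(offsets)
    return {
        "avg_offset": avg,
        "std_offset": statistics.pstdev(offsets),
        "samples": len(offsets),
        "use_offset": abs(avg) > OFFSET_THRESHOLD,
    }
```

A consistently positive average offset means the broker quotes above the API feed, so entries and stops derived from API data would be shifted by that amount at execution.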
---
## API Endpoints
### Data Service API
```python
# api/data_api.py
from datetime import datetime

from fastapi import BackgroundTasks, Depends, FastAPI
from pydantic import BaseModel

app = FastAPI(title="Trading Platform Data Service")

# db_pool, data_fetcher and gap_handler are initialized at application startup.


class SyncRequest(BaseModel):
    symbol: str
    timeframe: str = "M5"
    full_sync: bool = False


class DataQuery(BaseModel):
    symbol: str
    timeframe: str = "M5"
    start: datetime
    end: datetime
    limit: int = 10000


@app.post("/api/data/sync")
async def sync_data(request: SyncRequest, background_tasks: BackgroundTasks):
    """Starts a data sync in the background."""
    if request.full_sync:
        background_tasks.add_task(
            data_fetcher.full_sync,
            request.symbol,
            request.timeframe
        )
    else:
        background_tasks.add_task(
            data_fetcher.incremental_sync,
            request.symbol,
            request.timeframe
        )
    return {"message": f"Sync started for {request.symbol}", "full_sync": request.full_sync}


@app.get("/api/data/candles")
async def get_candles(query: DataQuery = Depends()):
    """Fetches candles from the database (model fields arrive as query parameters)."""
    async with db_pool.acquire() as conn:
        rows = await conn.fetch('''
            SELECT timestamp, open, high, low, close, volume
            FROM market_data
            WHERE symbol = $1 AND timeframe = $2
            AND timestamp BETWEEN $3 AND $4
            ORDER BY timestamp
            LIMIT $5
        ''', query.symbol, query.timeframe, query.start, query.end, query.limit)
    return {
        "symbol": query.symbol,
        "timeframe": query.timeframe,
        "count": len(rows),
        "candles": [dict(row) for row in rows]
    }


@app.get("/api/data/status")
async def get_sync_status():
    """Sync status for all symbols."""
    async with db_pool.acquire() as conn:
        rows = await conn.fetch('SELECT * FROM sync_status ORDER BY symbol')
    return {"status": [dict(row) for row in rows]}


@app.get("/api/data/gaps/{symbol}")
async def get_gaps(symbol: str, timeframe: str = "M5"):
    """Lists detected gaps."""
    async with db_pool.acquire() as conn:
        rows = await conn.fetch('''
            SELECT * FROM data_gaps
            WHERE symbol = $1 AND timeframe = $2
            ORDER BY gap_start
        ''', symbol, timeframe)
    return {"gaps": [dict(row) for row in rows]}


@app.post("/api/data/gaps/detect")
async def detect_gaps(symbol: str, timeframe: str = "M5"):
    """Detects gaps in the data."""
    gaps = await gap_handler.detect_gaps(symbol, timeframe)
    return {
        "symbol": symbol,
        "timeframe": timeframe,
        "gaps_found": len(gaps),
        "gaps": [
            {
                "start": g.start.isoformat(),
                "end": g.end.isoformat(),
                "missing": g.candles_missing
            }
            for g in gaps
        ]
    }


@app.post("/api/data/gaps/fill")
async def fill_gaps(background_tasks: BackgroundTasks, symbol: str, timeframe: str = "M5"):
    """Attempts to fill gaps; runs in the background."""
    background_tasks.add_task(gap_handler.fill_gaps, symbol, timeframe)
    return {"message": f"Gap filling started for {symbol}"}


@app.get("/api/data/health")
async def health():
    """Health check"""
    return {"status": "healthy"}
```
---
## Scheduling
### Celery Tasks
```python
# tasks/data_tasks.py
import asyncio

from celery import Celery
from celery.schedules import crontab

celery_app = Celery('trading_data')

celery_app.conf.beat_schedule = {
    # Full sync once a day at 4 AM UTC
    'full-sync-daily': {
        'task': 'tasks.full_sync_all',
        'schedule': crontab(hour=4, minute=0),
    },
    # Incremental sync every 5 minutes
    'incremental-sync': {
        'task': 'tasks.incremental_sync_all',
        'schedule': crontab(minute='*/5'),
    },
    # Gap detection daily
    'detect-gaps': {
        'task': 'tasks.detect_gaps_all',
        'schedule': crontab(hour=5, minute=0),
    },
    # Gap filling daily
    'fill-gaps': {
        'task': 'tasks.fill_gaps_all',
        'schedule': crontab(hour=6, minute=0),
    },
}

SYMBOLS = ['XAUUSD', 'EURUSD', 'GBPUSD', 'USDJPY']


@celery_app.task
def full_sync_all():
    """Full sync for all symbols."""
    for symbol in SYMBOLS:
        full_sync_symbol.delay(symbol)


@celery_app.task
def full_sync_symbol(symbol: str, timeframe: str = 'M5'):
    """Full sync for a single symbol."""
    asyncio.run(data_fetcher.full_sync(symbol, timeframe))


@celery_app.task
def incremental_sync_all():
    """Incremental sync for all symbols."""
    for symbol in SYMBOLS:
        incremental_sync_symbol.delay(symbol)


@celery_app.task
def incremental_sync_symbol(symbol: str, timeframe: str = 'M5'):
    """Incremental sync for a single symbol."""
    asyncio.run(data_fetcher.incremental_sync(symbol, timeframe))


@celery_app.task
def detect_gaps_all():
    """Detects gaps for all symbols."""
    for symbol in SYMBOLS:
        asyncio.run(gap_handler.detect_gaps(symbol, 'M5'))


@celery_app.task
def fill_gaps_all():
    """Fills gaps for all symbols."""
    for symbol in SYMBOLS:
        asyncio.run(gap_handler.fill_gaps(symbol, 'M5'))
```
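Celery task bodies are synchronous, so the async service methods have to be driven from plain functions; one common pattern is `asyncio.run`, which creates and tears down an event loop per task invocation. A minimal sketch of that bridge (the coroutine here is a hypothetical stand-in for `DataFetcher.incremental_sync`):

```python
import asyncio

async def incremental_sync(symbol: str, timeframe: str) -> dict:
    """Hypothetical stand-in for the real async service method."""
    return {"symbol": symbol, "timeframe": timeframe, "new_candles": 0}

def run_sync_task(symbol: str, timeframe: str = "M5") -> dict:
    """Synchronous wrapper, as a Celery task body would call asyncio.run."""
    return asyncio.run(incremental_sync(symbol, timeframe))
```

Creating a fresh loop per task is slightly slower than reusing one, but it avoids the "event loop is closed" failures that reusing a loop across worker invocations can produce.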
---
## Implementation
### Docker Compose
```yaml
# docker-compose.data.yaml
version: '3.8'

services:
  # =========================================================
  # NOTE: This service uses the workspace's shared PostgreSQL
  # instance (port 5438). Do NOT create a dedicated instance.
  # See: /home/isem/workspace/core/devtools/environment/DEVENV-PORTS.md
  # =========================================================
  data-service:
    build:
      context: .
      dockerfile: Dockerfile.data
    container_name: trading-data
    ports:
      - "3604:3604"  # Data service (base 3600 + 4)
    environment:
      - API_MASSIVE_KEY=${API_MASSIVE_KEY}
      - DATABASE_URL=postgresql://trading_user:trading_dev_2025@postgres:5438/trading_data
      # Inside the compose network Redis listens on its container port (6379);
      # 6385 is only the host-side mapping.
      - REDIS_URL=redis://redis:6379
      - SERVICE_PORT=3604
    depends_on:
      - redis
    networks:
      - trading-network
    restart: unless-stopped

  celery-worker:
    build:
      context: .
      dockerfile: Dockerfile.data
    container_name: trading-celery
    command: celery -A tasks worker -l info
    environment:
      - API_MASSIVE_KEY=${API_MASSIVE_KEY}
      - DATABASE_URL=postgresql://trading_user:trading_dev_2025@postgres:5438/trading_data
      - REDIS_URL=redis://redis:6379
    networks:
      - trading-network
    restart: unless-stopped

  celery-beat:
    build:
      context: .
      dockerfile: Dockerfile.data
    container_name: trading-beat
    command: celery -A tasks beat -l info
    environment:
      - REDIS_URL=redis://redis:6379
    depends_on:
      - redis
    networks:
      - trading-network
    restart: unless-stopped

  # Dedicated Redis for trading-platform (host port 6385)
  redis:
    image: redis:7-alpine
    container_name: trading-redis
    ports:
      - "6385:6379"
    volumes:
      - redis_data:/data
    restart: unless-stopped

networks:
  trading-network:
    external: true  # Shared workspace network

volumes:
  redis_data:
```
---
**Document Generated:** 2025-12-08
**Trading Strategist - Trading Platform**