# 05-EXECUTION - ML Data Migration & Model Training

## Date: 2026-01-25
## Phase: EXECUTION (E)
## Status: COMPLETED

---

## 1. Python Environment

### 1.1 Creating the Virtual Environment

```bash
# Create the venv in the Linux home directory (avoids cross-filesystem issues)
wsl -d Ubuntu-24.04 -u developer -- python3 -m venv ~/venvs/data-service

# Install dependencies
wsl -d Ubuntu-24.04 -u developer -- ~/venvs/data-service/bin/pip install \
    aiohttp asyncpg pandas numpy python-dotenv structlog
```

### 1.2 ML Dependencies

```bash
wsl -d Ubuntu-24.04 -u developer -- ~/venvs/data-service/bin/pip install \
    xgboost scikit-learn joblib sqlalchemy pyyaml loguru psycopg2-binary
```

---

## 2. Loading Data from Polygon

### 2.1 Script Created: `apps/data-service/scripts/fetch_polygon_data.py`

Features:
- Async requests to the Polygon API with aiohttp
- Rate limiting (5 req/min)
- Batch inserts with asyncpg
- ON CONFLICT handling for upserts
- Timezone normalization

### 2.2 Execution

```bash
cd /mnt/c/Empresas/ISEM/workspace-v2/projects/trading-platform/apps/data-service
~/venvs/data-service/bin/python scripts/fetch_polygon_data.py
```

### 2.3 Result

- Total time: ~2 hours (bounded by the rate limit)
- Bars loaded: 469,217
- No errors

---

## 3. ML Engine Migration to PostgreSQL

### 3.1 Files Created

**`apps/ml-engine/src/data/database.py`** (356 lines)
- `PostgreSQLConnection` class
- Methods: `get_ticker_data()`, `execute_query()`, `get_all_tickers()`
- Automatic MySQL→PostgreSQL translation
- `MySQLConnection` alias kept for compatibility

**`apps/ml-engine/src/data/__init__.py`**
- Exports: DatabaseManager, PostgreSQLConnection, load_ohlcv_from_postgres

### 3.2 Configuration Updated

**`apps/ml-engine/config/database.yaml`**

```yaml
postgres:
  host: localhost
  port: 5432
  database: trading_platform
  user: trading_user
  password: trading_dev_2026

mysql:
  _deprecated: true
```

**`apps/ml-engine/.env`**

```
DB_HOST=localhost
DB_PORT=5432
DB_NAME=trading_platform
DB_USER=trading_user
DB_PASSWORD=trading_dev_2026
```

---

## 4. Model Training

### 4.1 Execution

```bash
cd /mnt/c/Empresas/ISEM/workspace-v2/projects/trading-platform/apps/ml-engine
~/venvs/data-service/bin/python -m training.train_attention_models
```

### 4.2 Result

- 12 models trained (6 symbols × 2 timeframes)
- Each model: regressor + classifier + metadata
- Report: `ATTENTION_TRAINING_REPORT_20260125_060911.md`

---

## 5. Commits Made

| Repo | Hash | Message |
|------|------|---------|
| trading-platform | ffee190 | docs: Update DATABASE/ML_INVENTORY |
| ml-engine-v2 | 475e913 | config: Update database.yaml |
| data-service-v2 | 0e20c7c | feat: Add Polygon fetch script |
| workspace-v2 | 9b9ca7b0 | chore: Update submodules |

---

## 6. Problems Resolved

### 6.1 PEP 668 Restriction
- **Error:** "externally-managed-environment"
- **Solution:** Use a venv instead of the global pip

### 6.2 Cross-Filesystem Venv
- **Error:** a venv created under `/mnt/c` did not work correctly
- **Solution:** Create the venv under `~/venvs/` (native Linux filesystem)

### 6.3 Timezone Comparison
- **Error:** "can't compare offset-naive and offset-aware datetimes"
- **Solution:** Call `.replace(tzinfo=None)` on timestamps read from PostgreSQL
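
The timezone issue in 6.3 can be reproduced and worked around with a small sketch like the one below. It is illustrative only: the helper name `to_naive_utc` and the sample timestamps are hypothetical and not taken from the project code. It also assumes the application side works with naive datetimes that represent UTC. Note that a bare `.replace(tzinfo=None)` drops the offset without converting the clock time, so the sketch converts to UTC first; if the PostgreSQL values are already in UTC, the two approaches give the same result.

```python
from datetime import datetime, timedelta, timezone


def to_naive_utc(ts: datetime) -> datetime:
    """Return ts as a naive datetime expressed in UTC (hypothetical helper)."""
    if ts.tzinfo is None:
        # Assumption: naive values are already in UTC, so leave them untouched.
        return ts
    # Convert to UTC first, then drop tzinfo so the value compares with naive ones.
    return ts.astimezone(timezone.utc).replace(tzinfo=None)


# An offset-aware timestamp, like one returned for a timestamptz column.
aware = datetime(2026, 1, 25, 6, 0, tzinfo=timezone(timedelta(hours=-5)))
# An offset-naive timestamp, like one produced by application code.
naive = datetime(2026, 1, 25, 10, 0)

# Comparing the two directly raises the TypeError reported in 6.3.
error_message = ""
try:
    aware < naive
except TypeError as exc:
    error_message = str(exc)

# After normalization both values are naive and the comparison works.
normalized = to_naive_utc(aware)
is_later = normalized > naive
```

Normalizing at the point where rows are read keeps the rest of the pipeline on a single convention, so mixed comparisons cannot occur further downstream.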