- Created full CAPVED folder with METADATA, 01-06 phases, and SUMMARY - Updated _INDEX.yml with new task entry - Documents: Polygon data loading, MySQL→PostgreSQL migration, 12 attention models Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
05-EJECUCION (Execution) - ML Data Migration & Model Training
Date: 2026-01-25
Phase: EXECUTION (E)
Status: COMPLETED
1. Python Environment
1.1 Creating the Virtual Environment
# Create the venv in the Linux home directory (avoids cross-filesystem issues)
wsl -d Ubuntu-24.04 -u developer -- python3 -m venv ~/venvs/data-service
# Install dependencies
wsl -d Ubuntu-24.04 -u developer -- ~/venvs/data-service/bin/pip install \
aiohttp asyncpg pandas numpy python-dotenv structlog
1.2 ML Dependencies
wsl -d Ubuntu-24.04 -u developer -- ~/venvs/data-service/bin/pip install \
xgboost scikit-learn joblib sqlalchemy pyyaml loguru psycopg2-binary
2. Loading Data from Polygon
2.1 Script Created: apps/data-service/scripts/fetch_polygon_data.py
Features:
- Async requests to the Polygon API with aiohttp
- Rate limiting (5 req/min)
- Batch inserts with asyncpg
- ON CONFLICT handling for upserts
- Timezone normalization
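The 5 req/min limit listed above can be sketched as a small async limiter that spaces requests evenly (one every 12 seconds). This is a minimal illustration, not the code in fetch_polygon_data.py: the class name `MinIntervalLimiter` and its injectable `clock`/`sleep` parameters are hypothetical and exist only so the behavior can be checked without real waiting.

```python
import asyncio
import time


class MinIntervalLimiter:
    """Allow at most `rate` calls per `per` seconds by spacing calls evenly."""

    def __init__(self, rate=5, per=60.0, clock=time.monotonic, sleep=asyncio.sleep):
        self.interval = per / rate  # 5 req/min -> one request every 12 s
        self._clock = clock
        self._sleep = sleep
        self._next_slot = None  # earliest time the next call may start
        self._lock = asyncio.Lock()  # serializes concurrent callers

    async def wait(self) -> float:
        """Block until the next slot is free; return the seconds slept."""
        async with self._lock:
            now = self._clock()
            delay = 0.0
            if self._next_slot is not None and now < self._next_slot:
                delay = self._next_slot - now
                await self._sleep(delay)
            # The slot just used starts at `now` or at the reserved slot,
            # whichever is later; the next one opens one interval after it.
            start = now if self._next_slot is None else max(now, self._next_slot)
            self._next_slot = start + self.interval
            return delay
```

In a fetch loop, each coroutine would call `await limiter.wait()` immediately before issuing its aiohttp request, so concurrent tasks share one request budget.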
2.2 Execution
cd /mnt/c/Empresas/ISEM/workspace-v2/projects/trading-platform/apps/data-service
~/venvs/data-service/bin/python scripts/fetch_polygon_data.py
2.3 Result
- Total time: ~2 hours (bounded by the rate limit)
- Bars loaded: 469,217
- No errors
3. Migrating the ML Engine to PostgreSQL
3.1 Files Created
apps/ml-engine/src/data/database.py (356 lines)
- PostgreSQLConnection class
- Methods: get_ticker_data(), execute_query(), get_all_tickers()
- Automatic MySQL→PostgreSQL translation
- MySQLConnection alias for compatibility
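The source does not say which rewrites the automatic MySQL→PostgreSQL translation in database.py performs. As a rough illustration of the idea only, the sketch below handles three well-known dialect differences: backtick-quoted identifiers, `LIMIT offset, count`, and `IFNULL`. The function name `translate_mysql_to_postgres` and the table and column names in the usage note are hypothetical.

```python
import re


def translate_mysql_to_postgres(sql: str) -> str:
    """Rewrite a few MySQL-only constructs into PostgreSQL syntax.

    Illustrative subset only: the regexes are naive and do not skip
    string literals or comments.
    """
    # MySQL quotes identifiers with backticks; PostgreSQL uses double quotes.
    sql = re.sub(r"`([^`]+)`", r'"\1"', sql)
    # MySQL `LIMIT offset, count` becomes `LIMIT count OFFSET offset`.
    sql = re.sub(
        r"\bLIMIT\s+(\d+)\s*,\s*(\d+)",
        r"LIMIT \2 OFFSET \1",
        sql,
        flags=re.IGNORECASE,
    )
    # MySQL IFNULL(a, b) maps to the standard COALESCE(a, b).
    sql = re.sub(r"\bIFNULL\s*\(", "COALESCE(", sql, flags=re.IGNORECASE)
    return sql
```

For example, ``translate_mysql_to_postgres("SELECT `close` FROM `bars` LIMIT 10, 50")`` yields `SELECT "close" FROM "bars" LIMIT 50 OFFSET 10`. A translation layer like this lets legacy callers keep their MySQL-style queries while the connection class targets PostgreSQL.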
apps/ml-engine/src/data/__init__.py
- Exports: DatabaseManager, PostgreSQLConnection, load_ohlcv_from_postgres
3.2 Updated Configuration
apps/ml-engine/config/database.yaml
postgres:
host: localhost
port: 5432
database: trading_platform
user: trading_user
password: trading_dev_2026
mysql:
_deprecated: true
apps/ml-engine/.env
DB_HOST=localhost
DB_PORT=5432
DB_NAME=trading_platform
DB_USER=trading_user
DB_PASSWORD=trading_dev_2026
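One way the `DB_*` variables above might be turned into a connection string is sketched here. The helper name `postgres_dsn` is hypothetical, and the sketch assumes the consuming driver accepts a standard `postgresql://` URI; it is not taken from the ML engine's code.

```python
import os
from urllib.parse import quote


def postgres_dsn(env=os.environ) -> str:
    """Build a postgresql:// URI from the DB_* variables."""
    host = env.get("DB_HOST", "localhost")
    port = int(env.get("DB_PORT", "5432"))
    name = env["DB_NAME"]  # required: raises KeyError if missing
    user = quote(env["DB_USER"], safe="")
    # Percent-encode the password so characters such as '@' or '/'
    # cannot break the URI structure.
    password = quote(env["DB_PASSWORD"], safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{name}"
```

Reading required values with `env[...]` rather than `env.get(...)` makes a missing variable fail at startup instead of at the first query.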
4. Model Training
4.1 Execution
cd /mnt/c/Empresas/ISEM/workspace-v2/projects/trading-platform/apps/ml-engine
~/venvs/data-service/bin/python -m training.train_attention_models
4.2 Result
- 12 models trained (6 symbols x 2 timeframes)
- Each model: regressor + classifier + metadata
- Report: ATTENTION_TRAINING_REPORT_20260125_060911.md
5. Commits Made
| Repo | Hash | Message |
|---|---|---|
| trading-platform | ffee190 | docs: Update DATABASE/ML_INVENTORY |
| ml-engine-v2 | 475e913 | config: Update database.yaml |
| data-service-v2 | 0e20c7c | feat: Add Polygon fetch script |
| workspace-v2 | 9b9ca7b0 | chore: Update submodules |
6. Problems Resolved
6.1 PEP 668 Restriction
- Error: "externally-managed-environment"
- Solution: Use a venv instead of a global pip install
6.2 Cross-Filesystem Venv
- Error: a venv created under /mnt/c did not work correctly
- Solution: Create the venv in ~/venvs/ (native Linux filesystem)
6.3 Timezone Comparison
- Error: "can't compare offset-naive and offset-aware datetimes"
- Solution: .replace(tzinfo=None) on timestamps returned by PostgreSQL
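The timezone error in 6.3 and its fix can be reproduced with a short sketch. The helper name `to_naive_utc` is hypothetical, and converting to UTC before dropping the offset is an added assumption. The source only states that `.replace(tzinfo=None)` was applied, which gives the same result when the timestamps are already in UTC.

```python
from datetime import datetime, timezone


def to_naive_utc(ts: datetime) -> datetime:
    """Return a naive datetime expressed in UTC."""
    if ts.tzinfo is None:
        return ts  # already naive; assumed to be UTC
    # Convert to UTC first so dropping tzinfo does not shift the instant.
    return ts.astimezone(timezone.utc).replace(tzinfo=None)


aware = datetime(2026, 1, 25, 6, 9, tzinfo=timezone.utc)  # e.g. a timestamptz value
naive = datetime(2026, 1, 25, 6, 0)                       # e.g. a locally built cutoff

try:
    aware < naive  # mixing aware and naive datetimes raises TypeError
except TypeError as exc:
    error = str(exc)

is_later = to_naive_utc(aware) > naive  # comparison works once both are naive
```

The opposite approach, attaching `timezone.utc` to the naive side, also resolves the error. What matters is that both operands are of the same kind before they are compared.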