# 05-EJECUCION - ML Data Migration & Model Training

## Date: 2026-01-25
## Phase: Execution (E)
## Status: Completed

---

## 1. Python Environment

### 1.1 Creating the Virtual Environment
```bash
# Create the venv in the Linux home directory (avoids cross-filesystem issues)
wsl -d Ubuntu-24.04 -u developer -- python3 -m venv ~/venvs/data-service

# Install dependencies
wsl -d Ubuntu-24.04 -u developer -- ~/venvs/data-service/bin/pip install \
    aiohttp asyncpg pandas numpy python-dotenv structlog
```

### 1.2 ML Dependencies
```bash
wsl -d Ubuntu-24.04 -u developer -- ~/venvs/data-service/bin/pip install \
    xgboost scikit-learn joblib sqlalchemy pyyaml loguru psycopg2-binary
```

---

## 2. Loading Data from Polygon

### 2.1 Script Created: `apps/data-service/scripts/fetch_polygon_data.py`

Features:
- Async requests to the Polygon API with aiohttp
- Rate limiting (5 requests/min)
- Batch inserts with asyncpg
- `ON CONFLICT` handling for upserts
- Timezone normalization

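
The rate-limiting feature listed above can be sketched as a sliding-window limiter. This is a minimal illustration, not the code in `fetch_polygon_data.py`: the class name `SlidingWindowLimiter`, the helper `fetch_with_limit`, and the `fetch` coroutine (a stand-in for an aiohttp request) are all assumptions. The clock and sleep functions are injectable so the logic can be exercised without waiting.

```python
import asyncio
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` acquisitions in any `period`-second window."""

    def __init__(self, max_calls=5, period=60.0,
                 clock=time.monotonic, sleep=asyncio.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._stamps = deque()  # times of the most recent acquisitions

    async def acquire(self):
        while True:
            now = self._clock()
            # Forget acquisitions that have left the window.
            while self._stamps and now - self._stamps[0] >= self.period:
                self._stamps.popleft()
            if len(self._stamps) < self.max_calls:
                self._stamps.append(now)
                return
            # Window is full: wait until the oldest acquisition expires,
            # then re-check (another task may have taken the slot).
            await self._sleep(self.period - (now - self._stamps[0]))


async def fetch_with_limit(limiter, fetch, urls):
    """Call the hypothetical coroutine `fetch(url)` once per permitted slot."""
    results = []
    for url in urls:
        await limiter.acquire()
        results.append(await fetch(url))
    return results
```

At 5 requests per minute, a two-hour run allows at most 600 requests, which is why the rate limit dominates the total load time.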
### 2.2 Execution
```bash
cd /mnt/c/Empresas/ISEM/workspace-v2/projects/trading-platform/apps/data-service
~/venvs/data-service/bin/python scripts/fetch_polygon_data.py
```

### 2.3 Result
- Total time: ~2 hours (bounded by the API rate limit)
- Bars loaded: 469,217
- No errors

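
For reference, the batched upsert path that loads bars like those counted above might look like the following sketch. The table name `ohlcv_bars`, its columns, and the conflict key `(ticker, ts)` are hypothetical, since the real schema is not shown in this document. The only driver call relied on is `executemany(command, args)`, which asyncpg connections provide; a small recording connection stands in for the database so the sketch runs without one.

```python
import asyncio

# Hypothetical schema: table, columns, and conflict key are assumptions.
UPSERT_SQL = """
INSERT INTO ohlcv_bars (ticker, ts, open, high, low, close, volume)
VALUES ($1, $2, $3, $4, $5, $6, $7)
ON CONFLICT (ticker, ts) DO UPDATE SET
    open = EXCLUDED.open,
    high = EXCLUDED.high,
    low = EXCLUDED.low,
    close = EXCLUDED.close,
    volume = EXCLUDED.volume
"""


def chunked(rows, size):
    """Yield consecutive slices of `rows` with at most `size` items each."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


async def upsert_bars(conn, rows, batch_size=1000):
    """Send rows in batches; re-running with the same keys updates in place.

    `conn` only needs an awaitable `executemany(sql, args)` method,
    as an asyncpg connection has.
    """
    total = 0
    for batch in chunked(rows, batch_size):
        await conn.executemany(UPSERT_SQL, batch)
        total += len(batch)
    return total


class RecordingConn:
    """Dry-run stand-in for a database connection; records batch sizes."""

    def __init__(self):
        self.batch_sizes = []

    async def executemany(self, sql, args):
        self.batch_sizes.append(len(args))


if __name__ == "__main__":
    demo_rows = [("SYM", i, 1.0, 2.0, 0.5, 1.5, 100) for i in range(2500)]
    conn = RecordingConn()
    print(asyncio.run(upsert_bars(conn, demo_rows)), conn.batch_sizes)
```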
---

## 3. ML Engine Migration to PostgreSQL

### 3.1 Files Created

**`apps/ml-engine/src/data/database.py`** (356 lines)
- `PostgreSQLConnection` class
- Methods: `get_ticker_data()`, `execute_query()`, `get_all_tickers()`
- Automatic MySQL→PostgreSQL translation
- `MySQLConnection` alias for backward compatibility

**`apps/ml-engine/src/data/__init__.py`**
- Exports: `DatabaseManager`, `PostgreSQLConnection`, `load_ohlcv_from_postgres`

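
To illustrate the translation layer and the compatibility alias described above, here is a minimal sketch. It assumes the translation applies to SQL text and covers only three common dialect differences; the actual rules in `database.py` are not shown in this document, and the `translate` method name is hypothetical.

```python
import re


def translate_mysql_to_postgres(sql: str) -> str:
    """Rewrite a few MySQL-specific constructs into PostgreSQL syntax.

    Regex-based and deliberately naive: it does not skip string literals.
    """
    # `identifier` -> "identifier"
    sql = re.sub(r"`([^`]+)`", r'"\1"', sql)
    # IFNULL(a, b) -> COALESCE(a, b)
    sql = re.sub(r"\bIFNULL\s*\(", "COALESCE(", sql, flags=re.IGNORECASE)
    # LIMIT offset, count -> LIMIT count OFFSET offset
    sql = re.sub(
        r"\bLIMIT\s+(\d+)\s*,\s*(\d+)",
        r"LIMIT \2 OFFSET \1",
        sql,
        flags=re.IGNORECASE,
    )
    return sql


class PostgreSQLConnection:
    """Stand-in for the real class; only the translation step is sketched."""

    @staticmethod
    def translate(sql: str) -> str:  # hypothetical method name
        return translate_mysql_to_postgres(sql)


# Older call sites that still import the MySQL name keep working.
MySQLConnection = PostgreSQLConnection
```

With an alias like this, existing code that instantiates `MySQLConnection` receives the PostgreSQL implementation without any changes at the call site.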
### 3.2 Updated Configuration

**`apps/ml-engine/config/database.yaml`**
```yaml
postgres:
  host: localhost
  port: 5432
  database: trading_platform
  user: trading_user
  password: trading_dev_2026

mysql:
  _deprecated: true
```

**`apps/ml-engine/.env`**
```
DB_HOST=localhost
DB_PORT=5432
DB_NAME=trading_platform
DB_USER=trading_user
DB_PASSWORD=trading_dev_2026
```

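
As a sketch of how the `DB_*` variables above could be turned into a connection URL, the helper below reads them from a mapping (by default the process environment) and assembles a `postgresql://` DSN. The function name `build_dsn` is hypothetical, and the sketch assumes the variables have already been loaded, for example with python-dotenv's `load_dotenv()`.

```python
import os
from urllib.parse import quote


def build_dsn(env=os.environ) -> str:
    """Assemble a PostgreSQL URL from DB_* settings (hypothetical helper)."""
    # Percent-encode credentials so special characters cannot break the URL.
    user = quote(env["DB_USER"], safe="")
    password = quote(env["DB_PASSWORD"], safe="")
    host = env.get("DB_HOST", "localhost")
    port = int(env.get("DB_PORT", "5432"))
    name = env["DB_NAME"]
    return f"postgresql://{user}:{password}@{host}:{port}/{name}"
```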
---

## 4. Model Training

### 4.1 Execution
```bash
cd /mnt/c/Empresas/ISEM/workspace-v2/projects/trading-platform/apps/ml-engine
~/venvs/data-service/bin/python -m training.train_attention_models
```

### 4.2 Result
- 12 models trained (6 symbols × 2 timeframes)
- Each model: regressor + classifier + metadata
- Report: `ATTENTION_TRAINING_REPORT_20260125_060911.md`

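
The model count above follows from a Cartesian product of symbols and timeframes. The sketch below enumerates it with placeholder names, since this document does not list the actual symbols or timeframes; the artifact naming scheme is likewise hypothetical.

```python
from itertools import product

# Placeholders: the real symbols and timeframes are not named in this document.
SYMBOLS = ["SYM1", "SYM2", "SYM3", "SYM4", "SYM5", "SYM6"]
TIMEFRAMES = ["TF1", "TF2"]
ARTIFACTS = ("regressor", "classifier", "metadata")


def training_plan(symbols, timeframes):
    """One entry per (symbol, timeframe) pair, each with three artifacts."""
    return [
        {
            "symbol": symbol,
            "timeframe": timeframe,
            # Hypothetical naming scheme for the saved files.
            "artifacts": [f"{symbol}_{timeframe}_{kind}" for kind in ARTIFACTS],
        }
        for symbol, timeframe in product(symbols, timeframes)
    ]


plan = training_plan(SYMBOLS, TIMEFRAMES)
# 6 symbols x 2 timeframes = 12 models, each with 3 artifacts = 36 files.
print(len(plan), sum(len(entry["artifacts"]) for entry in plan))
```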
---

## 5. Commits Made

| Repo | Hash | Message |
|------|------|---------|
| trading-platform | ffee190 | docs: Update DATABASE/ML_INVENTORY |
| ml-engine-v2 | 475e913 | config: Update database.yaml |
| data-service-v2 | 0e20c7c | feat: Add Polygon fetch script |
| workspace-v2 | 9b9ca7b0 | chore: Update submodules |

---

## 6. Problems Resolved

### 6.1 PEP 668 Restriction
- **Error:** "externally-managed-environment"
- **Solution:** Use a venv instead of installing with the global pip

### 6.2 Cross-Filesystem Venv
- **Error:** A venv created under /mnt/c did not work correctly
- **Solution:** Create the venv under ~/venvs/ (native Linux filesystem)

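
Both environment problems above can be caught early with a small guard at script start-up. The following is a sketch under two assumptions: that WSL mounts Windows drives at its default location (`/mnt/<drive letter>`), and that the helper names, which do not come from the project, are acceptable placeholders.

```python
import sys
from pathlib import PurePosixPath


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != sys.base_prefix


def on_windows_mount(path: str) -> bool:
    """True if `path` lies under a default WSL drive mount such as /mnt/c."""
    parts = PurePosixPath(path).parts
    return (
        len(parts) >= 3
        and parts[0] == "/"
        and parts[1] == "mnt"
        and len(parts[2]) == 1
    )


def environment_warnings() -> list:
    """Collect human-readable warnings instead of failing outright."""
    warnings = []
    if not in_virtualenv():
        warnings.append("not running inside a venv (PEP 668 may block pip)")
    elif on_windows_mount(sys.prefix):
        warnings.append("venv lives on a Windows mount; prefer ~/venvs/")
    return warnings
```

Calling `environment_warnings()` before any install or training step gives a clear message instead of a late, harder-to-diagnose failure.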
### 6.3 Timezone Comparison
- **Error:** "can't compare offset-naive and offset-aware datetimes"
- **Solution:** Apply `.replace(tzinfo=None)` to timestamps coming from PostgreSQL

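
The error and its fix can be reproduced in a few lines. The timestamps below are example values only. `strip_tz` mirrors the fix named above; `to_naive_utc` is a variation, not taken from the project, that converts to UTC before dropping the offset, which matters if the database ever returns a non-UTC offset.

```python
from datetime import datetime, timedelta, timezone

# Example values: an aware timestamp (as a driver may return for a
# timezone-aware column) and a naive one built in application code.
from_db = datetime(2026, 1, 25, 6, 0, tzinfo=timezone.utc)
local_naive = datetime(2026, 1, 25, 5, 0)
plus_two = datetime(2026, 1, 25, 8, 0, tzinfo=timezone(timedelta(hours=2)))


def comparison_raises(a: datetime, b: datetime) -> bool:
    """True if ordering `a` and `b` raises TypeError (naive vs. aware)."""
    try:
        a < b
    except TypeError:
        return True
    return False


def strip_tz(ts: datetime) -> datetime:
    """The fix named above: drop tzinfo without shifting the clock time."""
    return ts.replace(tzinfo=None)


def to_naive_utc(ts: datetime) -> datetime:
    """Variation: convert to UTC first so a non-UTC offset is not lost."""
    if ts.tzinfo is None:
        return ts
    return ts.astimezone(timezone.utc).replace(tzinfo=None)
```

Plain `.replace(tzinfo=None)` is safe as long as every stored timestamp is already in UTC; otherwise the conversion step avoids silently shifting values.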