trading-platform/docs/97-adr/ADR-005-devops.md

| id | title | type | project | version | updated_date |
|---|---|---|---|---|---|
| ADR-005-devops | DevOps and CI/CD | Documentation | trading-platform | 1.0.0 | 2026-01-04 |

ADR-005: DevOps and CI/CD

Status: Accepted
Date: 2025-12-06
Deciders: Tech Lead, Architect, DevOps
Related: ADR-001, ADR-002, ADR-003


Context

Trading Platform needs a DevOps strategy that provides:

  1. Easy local development: onboarding new developers in under 15 minutes
  2. Continuous Integration: automated tests on every PR
  3. Continuous Deployment: automatic deploys to staging and production
  4. Consistent environments: "works on my machine" is not acceptable
  5. Fast rollbacks: reverting a deploy in under 5 minutes when issues arise
  6. Monitoring: logs, metrics, and alerts for production

The team has experience with GitHub Actions, Docker, and cloud deployments.


Decision

CI/CD Platform

GitHub Actions as the primary platform

Environments

| Environment | Trigger | Purpose | Uptime SLA |
|---|---|---|---|
| Local | Manual | Development | N/A |
| Staging | Push to develop | Internal testing, demos | 95% |
| Production | Release tag | End users | 99.5% |
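
The trigger column of the environments table can be read as a mapping from Git ref to target environment. The sketch below illustrates that mapping; `env_for_ref` is a hypothetical helper, not something the repository is known to contain, and the `v*` tag pattern is taken from the `refs/tags/v` condition used in the CI workflow.

```bash
#!/bin/bash
# Map a Git ref to the environment it deploys to.
# env_for_ref is a hypothetical helper used for illustration only.
env_for_ref() {
  case "$1" in
    refs/heads/develop) echo "staging" ;;      # push to develop
    refs/tags/v*)       echo "production" ;;   # release tag such as v1.2.0
    *)                  echo "none" ;;         # PRs and other branches: CI only
  esac
}
```

In a workflow step this could be called as `env_for_ref "$GITHUB_REF"` to decide which deploy script, if any, should run.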

Local Development

Docker Compose for local services:

# docker-compose.yml
services:
  postgres:
    image: postgres:15-alpine
    ports: ["5432:5432"]
    environment:
      POSTGRES_DB: trading
      POSTGRES_USER: dev
      POSTGRES_PASSWORD: dev123
    volumes:
      - postgres_data:/var/lib/postgresql/data

  redis:
    image: redis:7-alpine
    ports: ["6379:6379"]

  backend:
    build: ./apps/backend
    ports: ["3000:3000"]
    depends_on: [postgres, redis]
    environment:
      DATABASE_URL: postgresql://dev:dev123@postgres:5432/trading
      REDIS_URL: redis://redis:6379

  frontend:
    build: ./apps/frontend
    ports: ["5173:5173"]
    volumes:
      - ./apps/frontend:/app
      - /app/node_modules

  ml-engine:
    build: ./apps/ml-engine
    ports: ["8000:8000"]

Setup Script:

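#!/bin/bash
# scripts/dev.sh: start infrastructure in Docker and run the apps on the host
set -euo pipefail
trap 'kill $(jobs -p) 2>/dev/null' EXIT   # stop background dev servers on exit

docker-compose up -d postgres redis
# Subshells keep each cd from affecting the commands that follow
(cd apps/backend && npm run dev) &
(cd apps/frontend && npm run dev) &
cd apps/ml-engine && uvicorn main:app --reload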

CI/CD Pipeline

# .github/workflows/ci.yml
name: CI Pipeline

on:
  pull_request:
  push:
    branches: [main, develop]
    tags: ['v*']   # without this, deploy-production never runs on release tags

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'
      - run: npm ci
      - run: npm run lint

  test:
    runs-on: ubuntu-latest
    needs: lint
    strategy:
      matrix:
        workspace: [frontend, backend, ml-engine]
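    steps:
      - uses: actions/checkout@v4
      # Pin the same Node version and npm cache as the lint job
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'
      - run: npm ci
      - run: npm run test -w apps/${{ matrix.workspace }}
      - run: npm run test:coverage -w apps/${{ matrix.workspace }}
      - uses: codecov/codecov-action@v3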

  build:
    runs-on: ubuntu-latest
    needs: test
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm run build --workspaces
      # v3 of the artifact actions is deprecated; upload and download must use the same major version
      - uses: actions/upload-artifact@v4
        with:
          name: build-artifacts
          path: |
            apps/frontend/dist
            apps/backend/dist

  deploy-staging:
    if: github.ref == 'refs/heads/develop'
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/download-artifact@v4
      - name: Deploy to Staging
        run: ./scripts/deploy-staging.sh

  deploy-production:
    if: startsWith(github.ref, 'refs/tags/v')
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/download-artifact@v4
      - name: Deploy to Production
        run: ./scripts/deploy-production.sh

Pipeline Stages

┌─────────┐     ┌──────┐     ┌───────┐     ┌────────┐
│  LINT   │────▶│ TEST │────▶│ BUILD │────▶│ DEPLOY │
└─────────┘     └──────┘     └───────┘     └────────┘
   2 min         5 min         3 min         2 min

Total time: ~12 minutes (with caching)

Deployment Strategy

Blue-green deployment for production:

  1. Deploy the new version to the "green" environment
  2. Run automated health checks
  3. Switch traffic from "blue" to "green"
  4. Keep "blue" for 24 hours for fast rollback
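
The four steps can be sketched as a small shell routine. This is an illustration under stated assumptions, not the project's deploy script: `deploy_to`, `check_health`, and `switch_traffic` are hypothetical hooks standing in for provider-specific commands, and `HEALTH_ATTEMPTS` and `HEALTH_DELAY` are made-up tuning variables.

```bash
#!/bin/bash
# Blue-green switch sketch. The three hooks below are hypothetical
# placeholders; replace them with real provider commands.
deploy_to()      { echo "deploying $2 to $1" >&2; }
check_health()   { echo "checking health of $1" >&2; }
switch_traffic() { echo "routing traffic to $1" >&2; }

# Return the idle color given the active one.
idle_color() {
  case "$1" in
    blue)  echo "green" ;;
    green) echo "blue" ;;
    *)     echo "unknown color: $1" >&2; return 1 ;;
  esac
}

# Retry a command until it succeeds or attempts run out.
# usage: wait_healthy <attempts> <delay_seconds> <command...>
wait_healthy() {
  local attempts=$1 delay=$2 i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then return 0; fi
    if [ "$i" -lt "$attempts" ]; then sleep "$delay"; fi
    i=$((i + 1))
  done
  return 1
}

# Deploy to the idle color, verify it, then switch traffic.
# Prints the color that serves traffic afterwards.
blue_green_deploy() {
  local active=$1 version=$2 target
  target=$(idle_color "$active") || return 1
  deploy_to "$target" "$version" || { echo "$active"; return 1; }
  if wait_healthy "${HEALTH_ATTEMPTS:-5}" "${HEALTH_DELAY:-3}" check_health "$target"; then
    switch_traffic "$target" && echo "$target"
  else
    echo "$active"   # traffic never moved, so the old color keeps serving
    return 1
  fi
}
```

Because traffic only moves after the health checks pass, a failed deploy leaves the active color untouched. Rolling back within the 24-hour window amounts to calling `switch_traffic` with the previous color.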

Infrastructure as Code

No IaC tool initially (Terraform/Pulumi):

  • MVP phase: manual deploys to the cloud provider
  • Post-MVP: migrate to Terraform once the infrastructure stabilizes

Consequences

Positive

  1. Fast onboarding: npm install && docker-compose up starts everything
  2. Confidence: automated tests prevent regressions
  3. Speed: deploy to staging on every commit to develop
  4. Easy rollback: blue-green allows reverting in under 5 minutes
  5. Consistency: Docker gives the same environment across local, staging, and production
  6. Free tier: GitHub Actions is free for public repos on standard runners; private repos on the Free plan include 2,000 min/month
  7. Matrix tests: tests run in parallel per workspace

Negative

  1. GitHub lock-in: hard to migrate to GitLab/Bitbucket
  2. Docker overhead: consumes resources on dev machines
  3. Secrets management: GitHub Secrets is not as robust as Vault
  4. Cold starts: the first Docker Compose startup can take ~2 min
  5. Manual IaC: without Terraform, infrastructure is managed by hand

Risks and Mitigations

| Risk | Mitigation |
|---|---|
| GitHub Actions down | Local artifact cache, manual deploy |
| Secrets leak | Pre-commit hook with gitleaks |
| Deploy failures | Health checks + auto-rollback |
| Slow CI | Aggressive caching + parallel jobs |
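
For the "Secrets leak" row, a pre-commit hook could look roughly like the sketch below. It assumes gitleaks v8, where `gitleaks protect --staged` scans staged changes; newer releases deprecate `protect` in favor of `gitleaks git --pre-commit --staged`, so check `gitleaks --help` for the installed version. `GITLEAKS_BIN` is a hypothetical override added here so the function can be exercised without gitleaks installed.

```bash
#!/bin/bash
# Pre-commit secret scan sketch (assumes gitleaks v8 on PATH).
# GITLEAKS_BIN is a hypothetical override, not a gitleaks setting.
run_secret_scan() {
  local bin="${GITLEAKS_BIN:-gitleaks}"
  if ! command -v "$bin" >/dev/null 2>&1; then
    echo "pre-commit: $bin not found, cannot scan for secrets" >&2
    return 2
  fi
  # Scan only staged changes; a non-zero exit status blocks the commit.
  "$bin" protect --staged --redact
}
```

Saved as `.git/hooks/pre-commit` with a final `run_secret_scan` call, the hook would reject any commit for which the scan exits non-zero.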

Alternatives Considered

1. GitLab CI/CD

  • Pros: integrated CI/CD, self-hosted option
  • Cons: weaker integration with GitHub
  • Decision: Rejected. We already use GitHub for repos

2. CircleCI

  • Pros: very fast, good caching
  • Cons: expensive pricing, vendor lock-in
  • Decision: Rejected. GitHub Actions is sufficient and free

3. Jenkins

  • Pros: self-hosted, very flexible
  • Cons: complex maintenance, requires a dedicated server
  • Decision: Rejected. Maintenance overhead

4. Kubernetes for Local Dev

  • Pros: full parity with production
  • Cons: excessive complexity, steep learning curve
  • Decision: Rejected. Docker Compose is sufficient for the MVP

5. Immediate Terraform

  • Pros: IaC from day 1
  • Cons: slows down the MVP, infrastructure is still changing a lot
  • Decision: Postponed. Implement post-MVP once the infrastructure stabilizes

6. Canary Deployments

  • Pros: gradual rollout, lower risk
  • Cons: routing complexity, requires a service mesh
  • Decision: Rejected. Blue-green is sufficient for the initial phase

Deployment Checklist

Pre-Deploy

  • Tests passing (70-80% coverage target)
  • Linting clean (no warnings)
  • Migrations reviewed (backward compatible)
  • Secrets rotated if needed
  • Changelog updated
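
The coverage item can be enforced mechanically. The sketch below treats 70% as the minimum, which is an assumption read from the lower bound of the 70-80% range; `coverage_ok` is a hypothetical helper, and how the actual percentage is extracted from the coverage report is left open.

```bash
#!/bin/bash
# Coverage gate sketch.
# usage: coverage_ok <actual_percent> [minimum_percent]
# The default minimum of 70 is an assumption based on the checklist range.
coverage_ok() {
  local actual=$1 minimum=${2:-70}
  # awk handles decimal percentages, which plain shell arithmetic cannot
  awk -v a="$actual" -v m="$minimum" 'BEGIN { exit !(a + 0 >= m + 0) }'
}
```

A pipeline step could then run `coverage_ok "$COVERAGE" || exit 1`, where `COVERAGE` is whatever variable holds the measured percentage.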

Deploy

  • Health checks green
  • Database migrations executed
  • Cache invalidated (Redis FLUSHDB on staging)
  • Logs monitored for 15 min post-deploy
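
Since FLUSHDB wipes the whole database, the cache step is worth guarding so it only ever runs against staging. A minimal sketch follows; `invalidate_cache` and the `REDIS_CLI` override are hypothetical names, and the default `REDIS_URL` mirrors the local Compose setup rather than any real staging host.

```bash
#!/bin/bash
# Guarded cache invalidation sketch.
# usage: invalidate_cache <environment>
# REDIS_CLI is a hypothetical override; REDIS_URL defaults to a local instance.
invalidate_cache() {
  local target_env=$1
  if [ "$target_env" != "staging" ]; then
    echo "refusing FLUSHDB outside staging (got: $target_env)" >&2
    return 1
  fi
  "${REDIS_CLI:-redis-cli}" -u "${REDIS_URL:-redis://localhost:6379}" FLUSHDB
}
```

The checklist only specifies FLUSHDB for staging, so production invalidation would need a narrower mechanism that this sketch does not cover.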

Post-Deploy

  • E2E smoke tests
  • Metrics dashboard reviewed
  • Rollback plan confirmed
  • Team notified in Slack

Monitoring and Observability (Post-MVP)

# Future ADR candidates
logging: Winston/Pino → CloudWatch/Datadog
metrics: Prometheus → Grafana
errors: Sentry
uptime: UptimeRobot
apm: New Relic / Datadog APM

References