
---
id: JENKINS-DEPLOY
title: Jenkins CI/CD - Deploy Configuration
type: Documentation
project: trading-platform
version: 1.0.0
updated_date: 2026-01-04
---

Jenkins CI/CD - Deploy Configuration

Version: 1.0.0  Date: 2025-12-05  Author: Documentation and Planning Agent  Status: Approved


1. Executive Summary

This document defines the complete CI/CD strategy for the Trading Platform using Jenkins, covering:

  • One pipeline per service
  • Required environment variables
  • Assigned production ports
  • Deploy strategies (blue/green, canary, rolling)
  • Post-deploy monitoring

2. Deploy Architecture

┌──────────────────────────────────────────────────────────────────────────┐
│                            JENKINS SERVER                                 │
│                         jenkins.trading.com                             │
│                                                                           │
│  ┌────────────────────────────────────────────────────────────────────┐  │
│  │                         PIPELINES                                   │  │
│  │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────┐ │  │
│  │  │ Frontend │  │ Backend  │  │ML Engine │  │   Data   │  │  DB  │ │  │
│  │  │  Build   │  │  Build   │  │  Deploy  │  │ Service  │  │ Mig  │ │  │
│  │  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘  └──┬───┘ │  │
│  └───────┼─────────────┼─────────────┼─────────────┼────────────┼─────┘  │
│          │             │             │             │            │         │
└──────────┼─────────────┼─────────────┼─────────────┼────────────┼─────────┘
           │             │             │             │            │
           ▼             ▼             ▼             ▼            ▼
┌──────────────────────────────────────────────────────────────────────────┐
│                        PRODUCTION ENVIRONMENT                             │
│                       prod.trading.com                                  │
│                                                                           │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌─────┐│
│  │ Cloudflare │  │   Docker   │  │   Docker   │  │   Docker   │  │ PG  ││
│  │   Pages    │  │  Backend   │  │ ML Engine  │  │    Data    │  │ 15  ││
│  │            │  │  :3001     │  │  :8000     │  │  :8001     │  │:5432││
│  │ Frontend   │  │            │  │            │  │            │  │     ││
│  │   Assets   │  │  Node.js   │  │  Python    │  │  Python    │  │Redis││
│  └────────────┘  └────────────┘  └────────────┘  └────────────┘  │:6379││
│                                                                   └─────┘│
│                                                                           │
│  Load Balancer: Nginx (ports 80/443 → 3001)                              │
│  SSL/TLS: Let's Encrypt (auto-renew)                                     │
│  Monitoring: Prometheus + Grafana                                        │
└──────────────────────────────────────────────────────────────────────────┘

3. Pipelines per Service

3.1 Frontend Pipeline (Cloudflare Pages)

Jenkinsfile: apps/frontend/Jenkinsfile

pipeline {
    agent any

    environment {
        CLOUDFLARE_API_TOKEN = credentials('cloudflare-api-token')
        PROJECT_NAME = 'trading-frontend'
        VITE_API_URL = 'https://api.trading.com'
        VITE_WS_URL = 'wss://api.trading.com'
        VITE_STRIPE_PUBLIC_KEY = credentials('stripe-public-key')
    }

    stages {
        stage('Checkout') {
            steps {
                git branch: 'main',
                    url: 'https://github.com/trading/trading-platform.git'
            }
        }

        stage('Install Dependencies') {
            steps {
                dir('apps/frontend') {
                    sh 'npm ci'
                }
            }
        }

        stage('Lint') {
            steps {
                dir('apps/frontend') {
                    sh 'npm run lint'
                }
            }
        }

        stage('Type Check') {
            steps {
                dir('apps/frontend') {
                    sh 'npm run type-check'
                }
            }
        }

        stage('Unit Tests') {
            steps {
                dir('apps/frontend') {
                    sh 'npm run test:ci'
                }
            }
        }

        stage('Build') {
            steps {
                dir('apps/frontend') {
                    sh 'npm run build'
                }
            }
        }

        stage('E2E Tests (Staging)') {
            when {
                branch 'main'
            }
            steps {
                dir('apps/frontend') {
                    sh 'npm run test:e2e:ci'
                }
            }
        }

        stage('Deploy to Cloudflare') {
            when {
                branch 'main'
            }
            steps {
                dir('apps/frontend') {
                    sh '''
                        npx wrangler pages deploy dist \
                            --project-name=$PROJECT_NAME \
                            --branch=production
                    '''
                }
            }
        }

        stage('Smoke Tests') {
            when {
                branch 'main'
            }
            steps {
                sh '''
                    curl -f https://trading.com/health || exit 1
                '''
            }
        }
    }

    post {
        success {
            slackSend(
                color: 'good',
                message: "Frontend deployed successfully: ${env.BUILD_URL}"
            )
        }
        failure {
            slackSend(
                color: 'danger',
                message: "Frontend deploy FAILED: ${env.BUILD_URL}"
            )
        }
        always {
            cleanWs()
        }
    }
}

Triggers:

  • Push to the main branch
  • Manual trigger
  • Cron: nightly build H 2 * * *

Duration: ~5-8 minutes


3.2 Backend API Pipeline (Docker)

Jenkinsfile: apps/backend/Jenkinsfile

pipeline {
    agent any

    environment {
        DOCKER_REGISTRY = 'registry.trading.com'
        IMAGE_NAME = 'backend-api'
        IMAGE_TAG = "${env.BUILD_NUMBER}"
        DATABASE_URL = credentials('production-database-url')
        REDIS_URL = credentials('production-redis-url')
        JWT_SECRET = credentials('jwt-secret')
        STRIPE_SECRET_KEY = credentials('stripe-secret-key')
        STRIPE_WEBHOOK_SECRET = credentials('stripe-webhook-secret')
        ML_ENGINE_URL = 'http://ml-engine:8000'
        CLAUDE_API_KEY = credentials('claude-api-key')
    }

    stages {
        stage('Checkout') {
            steps {
                git branch: 'main',
                    url: 'https://github.com/trading/trading-platform.git'
            }
        }

        stage('Install Dependencies') {
            steps {
                dir('apps/backend') {
                    sh 'npm ci'
                }
            }
        }

        stage('Lint') {
            steps {
                dir('apps/backend') {
                    sh 'npm run lint'
                }
            }
        }

        stage('Type Check') {
            steps {
                dir('apps/backend') {
                    sh 'npm run type-check'
                }
            }
        }

        stage('Unit Tests') {
            steps {
                dir('apps/backend') {
                    sh 'npm run test:ci'
                }
            }
        }

        stage('Build TypeScript') {
            steps {
                dir('apps/backend') {
                    sh 'npm run build'
                }
            }
        }

        stage('Integration Tests') {
            steps {
                dir('apps/backend') {
                    sh '''
                        docker-compose -f docker-compose.test.yml up -d
                        npm run test:integration
                        docker-compose -f docker-compose.test.yml down
                    '''
                }
            }
        }

        stage('Build Docker Image') {
            steps {
                dir('apps/backend') {
                    sh '''
                        docker build \
                            -t $DOCKER_REGISTRY/$IMAGE_NAME:$IMAGE_TAG \
                            -t $DOCKER_REGISTRY/$IMAGE_NAME:latest \
                            .
                    '''
                }
            }
        }

        stage('Security Scan') {
            steps {
                sh '''
                    docker run --rm \
                        -v /var/run/docker.sock:/var/run/docker.sock \
                        aquasec/trivy image \
                        $DOCKER_REGISTRY/$IMAGE_NAME:$IMAGE_TAG
                '''
            }
        }

        stage('Push to Registry') {
            steps {
                sh '''
                    docker push $DOCKER_REGISTRY/$IMAGE_NAME:$IMAGE_TAG
                    docker push $DOCKER_REGISTRY/$IMAGE_NAME:latest
                '''
            }
        }

        stage('Deploy - Blue/Green') {
            when {
                branch 'main'
            }
            steps {
                script {
                    // Determine the currently active color
                    def currentColor = sh(
                        script: '''
                            curl -s http://prod-lb.internal/health | \
                            jq -r '.color' || echo 'blue'
                        ''',
                        returnStdout: true
                    ).trim()

                    def newColor = currentColor == 'blue' ? 'green' : 'blue'

                    echo "Current: $currentColor, Deploying: $newColor"

                    // Deploy to the new color
                    sh """
                        docker-compose -f docker-compose.prod.yml \
                            -p backend-$newColor \
                            up -d --force-recreate
                    """

                    // Health check: fail the deploy if the new color never becomes healthy
                    sh """
                        for i in \$(seq 1 30); do
                            if curl -f http://backend-$newColor:3001/health; then
                                echo "Backend $newColor is healthy"
                                exit 0
                            fi
                            echo "Waiting for backend $newColor... (\$i/30)"
                            sleep 2
                        done
                        echo "Backend $newColor failed health checks"
                        exit 1
                    """

                    // Switch load balancer
                    sh """
                        curl -X POST http://prod-lb.internal/switch \
                            -d '{"target": "$newColor"}'
                    """

                    // Wait for traffic drain
                    sleep 10

                    // Stop old version
                    sh """
                        docker-compose -f docker-compose.prod.yml \
                            -p backend-$currentColor \
                            down
                    """
                }
            }
        }

        stage('Smoke Tests') {
            when {
                branch 'main'
            }
            steps {
                sh '''
                    curl -f https://api.trading.com/health || exit 1
                    curl -f https://api.trading.com/health/detailed || exit 1
                '''
            }
        }
    }

    post {
        success {
            slackSend(
                color: 'good',
                message: "Backend API deployed successfully: Build #${env.BUILD_NUMBER}"
            )
        }
        failure {
            slackSend(
                color: 'danger',
                message: "Backend API deploy FAILED: ${env.BUILD_URL}"
            )
            // Automatic rollback: bring the blue stack back up
            script {
                sh '''
                    docker-compose -f docker-compose.prod.yml \
                        -p backend-blue \
                        up -d --force-recreate
                '''
            }
        }
        always {
            junit 'apps/backend/coverage/junit.xml'
            publishHTML([
                reportDir: 'apps/backend/coverage',
                reportFiles: 'index.html',
                reportName: 'Coverage Report'
            ])
            cleanWs()
        }
    }
}

Triggers:

  • Push to the main branch
  • Manual trigger

Duration: ~10-15 minutes

Rollback Strategy: Blue/green deployment with automatic rollback on failure
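The color-flip and health-wait logic in the 'Deploy - Blue/Green' stage can be sketched as two plain shell helpers. This is a minimal sketch: `next_color` and `retry` are illustrative names, not functions that exist in the pipeline.

```shell
#!/bin/sh
# next_color: given the currently active color, return the idle one to deploy to
next_color() {
    if [ "$1" = "blue" ]; then
        echo "green"
    else
        echo "blue"
    fi
}

# retry: run a command up to N times, returning 0 on the first success
# (the real stage sleeps 2s between attempts; omitted here for brevity)
retry() {
    cmd="$1"
    tries="$2"
    i=1
    while [ "$i" -le "$tries" ]; do
        if $cmd; then
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}
```

The stage would then deploy to `$(next_color "$current")` and gate the load-balancer switch on something like `retry "curl -f http://backend-$new:3001/health" 30`.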


3.3 ML Engine Pipeline (Docker GPU)

Jenkinsfile: apps/ml-engine/Jenkinsfile

pipeline {
    agent {
        label 'gpu-node'  // Node with a GPU
    }

    environment {
        DOCKER_REGISTRY = 'registry.trading.com'
        IMAGE_NAME = 'ml-engine'
        IMAGE_TAG = "${env.BUILD_NUMBER}"
        DATABASE_URL = credentials('production-database-url')
        REDIS_URL = credentials('production-redis-url')
        MODEL_PATH = '/app/models'
    }

    stages {
        stage('Checkout') {
            steps {
                git branch: 'main',
                    url: 'https://github.com/trading/trading-platform.git'
            }
        }

        stage('Validate Models') {
            steps {
                dir('apps/ml-engine') {
                    sh '''
                        # Verify that the model files exist
                        test -f models/phase2/range_predictor/15m/model_high.json
                        test -f models/phase2/range_predictor/15m/model_low.json
                        test -f models/phase2/tpsl_classifier/15m_rr_2_1.json
                    '''
                }
            }
        }

        stage('Unit Tests') {
            steps {
                dir('apps/ml-engine') {
                    sh '''
                        python3 -m venv venv
                        source venv/bin/activate
                        pip install -r requirements.txt
                        pytest tests/ -v --cov=app
                    '''
                }
            }
        }

        stage('Build Docker Image') {
            steps {
                dir('apps/ml-engine') {
                    sh '''
                        docker build \
                            -t $DOCKER_REGISTRY/$IMAGE_NAME:$IMAGE_TAG \
                            -t $DOCKER_REGISTRY/$IMAGE_NAME:latest \
                            -f Dockerfile.gpu \
                            .
                    '''
                }
            }
        }

        stage('Test Prediction') {
            steps {
                sh '''
                    docker run --rm --gpus all \
                        -e DATABASE_URL=$DATABASE_URL \
                        $DOCKER_REGISTRY/$IMAGE_NAME:$IMAGE_TAG \
                        python -m app.test_prediction
                '''
            }
        }

        stage('Push to Registry') {
            steps {
                sh '''
                    docker push $DOCKER_REGISTRY/$IMAGE_NAME:$IMAGE_TAG
                    docker push $DOCKER_REGISTRY/$IMAGE_NAME:latest
                '''
            }
        }

        stage('Deploy - Canary') {
            when {
                branch 'main'
            }
            steps {
                script {
                    // Deploy canary (10% traffic)
                    sh '''
                        docker-compose -f docker-compose.prod.yml \
                            -p ml-engine-canary \
                            up -d --force-recreate
                    '''

                    // Monitor canary metrics for 5 minutes
                    echo "Monitoring canary metrics..."
                    sleep 300

                    def errorRate = sh(
                        script: '''
                            curl -s http://prometheus:9090/api/v1/query \
                                --data 'query=rate(ml_prediction_errors[5m])' | \
                            jq -r '.data.result[0].value[1]'
                        ''',
                        returnStdout: true
                    ).trim().toFloat()

                    if (errorRate > 0.05) {
                        error("Canary error rate too high: ${errorRate}")
                    }

                    // Promote to 100%
                    sh '''
                        docker-compose -f docker-compose.prod.yml \
                            -p ml-engine \
                            up -d --force-recreate

                        # Stop canary
                        docker-compose -f docker-compose.prod.yml \
                            -p ml-engine-canary \
                            down
                    '''
                }
            }
        }

        stage('Smoke Tests') {
            when {
                branch 'main'
            }
            steps {
                sh '''
                    curl -f http://ml-engine:8000/health || exit 1
                    curl -X POST http://ml-engine:8000/predictions \
                        -H "X-API-Key: $ML_API_KEY" \
                        -d '{"symbol": "BTCUSDT", "horizon": 18}' || exit 1
                '''
            }
        }
    }

    post {
        success {
            slackSend(
                color: 'good',
                message: "ML Engine deployed successfully"
            )
        }
        failure {
            slackSend(
                color: 'danger',
                message: "ML Engine deploy FAILED - Rolling back"
            )
            // Rollback
            sh '''
                docker-compose -f docker-compose.prod.yml \
                    -p ml-engine-canary \
                    down
            '''
        }
        always {
            cleanWs()
        }
    }
}

Triggers:

  • Manual (requires approval)
  • Scheduled: weekly retrain H 0 * * 0

Duration: ~20-30 minutes

Deploy Strategy: Canary (10% of traffic → 100% if metrics are OK)
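The 5% error-rate gate in the canary stage reduces to a single floating-point comparison. A minimal sketch: the 0.05 threshold mirrors the pipeline above, but `canary_gate` itself is an illustrative name, not pipeline code.

```shell
#!/bin/sh
# canary_gate: decide whether to promote or roll back the canary based on
# the observed 5-minute error rate, given as a fraction (0.02 = 2%)
canary_gate() {
    rate="$1"
    # awk handles the floating-point comparison; plain sh arithmetic is integer-only
    if awk -v r="$rate" 'BEGIN { exit !(r <= 0.05) }'; then
        echo "promote"
    else
        echo "rollback"
    fi
}
```

In the Jenkinsfile, the equivalent decision is the `if (errorRate > 0.05)` check that aborts promotion.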


3.4 Data Service Pipeline

Jenkinsfile: apps/data-service/Jenkinsfile

pipeline {
    agent any

    environment {
        DOCKER_REGISTRY = 'registry.trading.com'
        IMAGE_NAME = 'data-service'
        IMAGE_TAG = "${env.BUILD_NUMBER}"
        DATABASE_URL = credentials('production-database-url')
        POLYGON_API_KEY = credentials('polygon-api-key')
        METAAPI_TOKEN = credentials('metaapi-token')
    }

    stages {
        stage('Checkout') {
            steps {
                git branch: 'main',
                    url: 'https://github.com/trading/trading-platform.git'
            }
        }

        stage('Unit Tests') {
            steps {
                dir('apps/data-service') {
                    sh '''
                        python3 -m venv venv
                        source venv/bin/activate
                        pip install -r requirements.txt
                        pytest tests/ -v
                    '''
                }
            }
        }

        stage('Build Docker Image') {
            steps {
                dir('apps/data-service') {
                    sh '''
                        docker build \
                            -t $DOCKER_REGISTRY/$IMAGE_NAME:$IMAGE_TAG \
                            -t $DOCKER_REGISTRY/$IMAGE_NAME:latest \
                            .
                    '''
                }
            }
        }

        stage('Push to Registry') {
            steps {
                sh '''
                    docker push $DOCKER_REGISTRY/$IMAGE_NAME:$IMAGE_TAG
                    docker push $DOCKER_REGISTRY/$IMAGE_NAME:latest
                '''
            }
        }

        stage('Deploy - Rolling Update') {
            when {
                branch 'main'
            }
            steps {
                sh '''
                    docker-compose -f docker-compose.prod.yml \
                        up -d --no-deps --force-recreate data-service
                '''
            }
        }

        stage('Verify Sync') {
            when {
                branch 'main'
            }
            steps {
                sh '''
                    sleep 60
                    curl -f http://data-service:8001/sync/status || exit 1
                '''
            }
        }
    }

    post {
        success {
            slackSend(
                color: 'good',
                message: "Data Service deployed successfully"
            )
        }
        failure {
            slackSend(
                color: 'danger',
                message: "Data Service deploy FAILED"
            )
        }
        always {
            cleanWs()
        }
    }
}

Triggers:

  • Push to the main branch
  • Manual trigger

Duration: ~8-12 minutes

Deploy Strategy: Rolling update (the service tolerates brief downtime)


3.5 Database Migration Pipeline

Jenkinsfile: apps/database/Jenkinsfile

pipeline {
    agent any

    environment {
        DATABASE_URL = credentials('production-database-url')
        BACKUP_BUCKET = 's3://trading-backups/db'
    }

    stages {
        stage('Checkout') {
            steps {
                git branch: 'main',
                    url: 'https://github.com/trading/trading-platform.git'
            }
        }

        stage('Pre-Migration Backup') {
            steps {
                sh '''
                    # Full backup before migrating
                    timestamp=$(date +%Y%m%d_%H%M%S)
                    pg_dump $DATABASE_URL | gzip > backup_$timestamp.sql.gz

                    # Upload to S3
                    aws s3 cp backup_$timestamp.sql.gz $BACKUP_BUCKET/

                    echo "Backup saved: $BACKUP_BUCKET/backup_$timestamp.sql.gz"
                '''
            }
        }

        stage('Validate Migrations') {
            steps {
                dir('apps/database/migrations') {
                    sh '''
                        # Validate each migration by applying it inside a transaction
                        # that is rolled back (psql has no --dry-run flag; this assumes
                        # migrations do not manage their own transactions)
                        for file in *.sql; do
                            { echo "BEGIN;"; cat "$file"; echo "ROLLBACK;"; } | \
                                psql $DATABASE_URL -v ON_ERROR_STOP=1
                        done
                    '''
                }
            }
        }

        stage('Run Migrations') {
            steps {
                input message: 'Approve migration to production?', ok: 'Deploy'

                dir('apps/database/migrations') {
                    sh '''
                        # Apply migrations in order
                        for file in $(ls -1 *.sql | sort); do
                            echo "Applying $file..."
                            psql $DATABASE_URL -f $file
                        done
                    '''
                }
            }
        }

        stage('Verify Schema') {
            steps {
                sh '''
                    # Verify that the expected tables exist
                    psql $DATABASE_URL -c "\\dt"

                    # Verify that the expected functions exist
                    psql $DATABASE_URL -c "\\df"
                '''
            }
        }

        stage('Run Smoke Tests') {
            steps {
                sh '''
                    # Test basic queries
                    psql $DATABASE_URL -c "SELECT COUNT(*) FROM public.users;"
                    psql $DATABASE_URL -c "SELECT COUNT(*) FROM market_data.ohlcv_5m;"
                '''
            }
        }
    }

    post {
        success {
            slackSend(
                color: 'good',
                message: "Database migration completed successfully"
            )
        }
        failure {
            slackSend(
                color: 'danger',
                message: "Database migration FAILED - Manual intervention required"
            )
            input message: 'Restore from backup?', ok: 'Restore'

            sh '''
                # Restore from the latest backup
                latest_backup=$(aws s3 ls $BACKUP_BUCKET/ | sort | tail -n 1 | awk '{print $4}')
                aws s3 cp $BACKUP_BUCKET/$latest_backup backup.sql.gz
                gunzip backup.sql.gz
                psql $DATABASE_URL < backup.sql
            '''
        }
        always {
            cleanWs()
        }
    }
}

Triggers:

  • Manual (requires explicit approval)

Duration: ~5-10 minutes (depends on database size)

Safety: Automatic backup before every migration
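Because backups are named `backup_YYYYMMDD_HHMMSS.sql.gz`, "latest backup" selection is just a lexicographic sort, which is what the `sort | tail -n 1` in the rollback step relies on. A minimal sketch (`pick_latest` is an illustrative helper, not part of the pipeline):

```shell
#!/bin/sh
# pick_latest: return the lexicographically last name; with the
# backup_YYYYMMDD_HHMMSS.sql.gz convention this is also the most recent backup
pick_latest() {
    printf '%s\n' "$@" | sort | tail -n 1
}
```

This is the reason the timestamp format matters: a name like `backup_7Jan2026.sql.gz` would break the sort-based selection.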


4. Environment Variables per Service

4.1 Frontend

# Build time (.env.production)
VITE_API_URL=https://api.trading.com
VITE_WS_URL=wss://api.trading.com
VITE_STRIPE_PUBLIC_KEY=pk_live_...
VITE_GOOGLE_CLIENT_ID=...
VITE_FACEBOOK_APP_ID=...

4.2 Backend API

# Runtime (.env.production)
NODE_ENV=production
PORT=3001

# Database
DATABASE_URL=postgresql://user:pass@prod-db.internal:5432/trading
REDIS_URL=redis://prod-redis.internal:6379

# Auth
JWT_SECRET=...
JWT_EXPIRES_IN=15m
REFRESH_TOKEN_EXPIRES_IN=7d

# Stripe
STRIPE_SECRET_KEY=sk_live_...
STRIPE_WEBHOOK_SECRET=whsec_...

# External Services
ML_ENGINE_URL=http://ml-engine:8000
ML_API_KEY=...
CLAUDE_API_KEY=sk-ant-...
TWILIO_ACCOUNT_SID=...
TWILIO_AUTH_TOKEN=...

# CORS
FRONTEND_URL=https://trading.com

# Monitoring
SENTRY_DSN=...
LOG_LEVEL=info

4.3 ML Engine

# Runtime (.env.production)
DATABASE_URL=postgresql://user:pass@prod-db.internal:5432/trading
REDIS_URL=redis://prod-redis.internal:6379

# Models
MODEL_PATH=/app/models
SUPPORTED_SYMBOLS=BTCUSDT,ETHUSDT,XAUUSD,EURUSD,GBPUSD
DEFAULT_HORIZONS=6,18,36,72

# GPU
CUDA_VISIBLE_DEVICES=0

# API
API_HOST=0.0.0.0
API_PORT=8000
API_KEY=...

4.4 Data Service

# Runtime (.env.production)
DATABASE_URL=postgresql://user:pass@prod-db.internal:5432/trading

# Polygon
POLYGON_API_KEY=...
POLYGON_BASE_URL=https://api.polygon.io
POLYGON_RATE_LIMIT=100

# MetaAPI
METAAPI_TOKEN=...
METAAPI_ACCOUNT_ID=...

# Sync
SYNC_INTERVAL_MINUTES=5
BACKFILL_DAYS=30
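Since every service reads its configuration from environment variables, a fail-fast check at container start catches a missing secret before the service begins serving traffic. A minimal sketch under the assumption that each image has an entrypoint script (`require_env` is an illustrative name; the services above are not documented as doing this today):

```shell
#!/bin/sh
# require_env: fail if any of the named environment variables is unset or empty
require_env() {
    for name in "$@"; do
        # indirect expansion via eval, portable to plain POSIX sh
        eval "val=\${$name:-}"
        if [ -z "$val" ]; then
            echo "Missing required env var: $name" >&2
            return 1
        fi
    done
    return 0
}
```

An entrypoint could then run, for example, `require_env DATABASE_URL REDIS_URL JWT_SECRET || exit 1` before starting the server.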

5. Production Ports

| Service      | Internal Port | Public Port     | Protocol |
|--------------|---------------|-----------------|----------|
| Frontend     | -             | 443 (HTTPS)     | HTTPS    |
| Backend API  | 3001          | 443 (via Nginx) | HTTPS    |
| ML Engine    | 8000          | - (internal)    | HTTP     |
| Data Service | 8001          | - (internal)    | HTTP     |
| PostgreSQL   | 5432          | - (internal)    | TCP      |
| Redis        | 6379          | - (internal)    | TCP      |
| Prometheus   | 9090          | - (internal)    | HTTP     |
| Grafana      | 3000          | - (VPN only)    | HTTP     |

Nginx Config:

upstream backend {
    # The active color takes all traffic; the idle color is parked with
    # "down" (stock nginx rejects weight=0 on upstream servers)
    server backend-green:3001;
    server backend-blue:3001 down;
}

server {
    listen 443 ssl http2;
    server_name api.trading.com;

    ssl_certificate /etc/letsencrypt/live/api.trading.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/api.trading.com/privkey.pem;

    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
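One common way to implement the blue/green switch is to regenerate the upstream block so the idle color is marked `down`, then reload Nginx. A sketch of the generation step only; `render_upstream`, the output path, and the reload command are assumptions, not documented parts of this setup:

```shell
#!/bin/sh
# render_upstream: emit an upstream block routing all traffic to the active
# color, with the idle color parked via the "down" marker
render_upstream() {
    active="$1"
    if [ "$active" = "blue" ]; then
        idle="green"
    else
        idle="blue"
    fi
    cat <<EOF
upstream backend {
    server backend-$active:3001;
    server backend-$idle:3001 down;
}
EOF
}
```

A switch script might then do `render_upstream green > /etc/nginx/conf.d/backend-upstream.conf && nginx -s reload` (paths assumed).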

6. Deploy Strategies

6.1 Blue/Green (Backend API)

┌─────────────────────────────────────────────────────┐
│                 Load Balancer                        │
│             (100% blue / 0% green)                   │
└───────────┬─────────────────────────────────────────┘
            │
     ┌──────┴──────┐
     ▼             ▼
┌─────────┐   ┌─────────┐
│  BLUE   │   │  GREEN  │
│ Active  │   │  Idle   │
└─────────┘   └─────────┘

1. Deploy to GREEN (0% traffic)
2. Health check GREEN
3. Switch to GREEN (100% traffic)
4. Stop BLUE

Advantages:

  • Zero downtime
  • Instant rollback
  • Full testing pre-production

Disadvantages:

  • Requires 2x resources

6.2 Canary (ML Engine)

┌─────────────────────────────────────────────────────┐
│                 Load Balancer                        │
│              (90% stable / 10% canary)               │
└───────────┬─────────────────────────────────────────┘
            │
     ┌──────┴──────┐
     ▼             ▼
┌─────────┐   ┌─────────┐
│ STABLE  │   │ CANARY  │
│  v1.5   │   │  v1.6   │
└─────────┘   └─────────┘

1. Deploy canary (10% traffic)
2. Monitor metrics 5-10 min
3. If OK: Promote to 100%
4. If ERROR: Rollback

Advantages:

  • Early error detection
  • Limited blast radius

Disadvantages:

  • Slower rollout
  • Complex routing

6.3 Rolling Update (Data Service)

┌──────┐   ┌──────┐   ┌──────┐
│ v1.5 │   │ v1.5 │   │ v1.5 │
└──┬───┘   └──┬───┘   └──┬───┘
   │          │          │
   ▼          │          │
┌──────┐      │          │
│ v1.6 │      │          │  Step 1: Update instance 1
└──────┘      │          │
              ▼          │
           ┌──────┐      │
           │ v1.6 │      │  Step 2: Update instance 2
           └──────┘      │
                         ▼
                      ┌──────┐
                      │ v1.6 │  Step 3: Update instance 3
                      └──────┘

Advantages:

  • Simple
  • Minimal extra resources

Disadvantages:

  • Brief downtime per instance
  • Slower rollout

7. Post-Deploy Monitoring

7.1 Health Checks

Backend API:

curl https://api.trading.com/health
# Expected:
{
  "status": "ok",
  "database": "connected",
  "redis": "connected",
  "uptime": 12345,
  "version": "1.5.2"
}

ML Engine:

curl http://ml-engine:8000/health
# Expected:
{
  "status": "ok",
  "gpu": true,
  "models_loaded": 5,
  "uptime": 67890
}
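Smoke tests can assert on the health payload itself rather than only the HTTP status code. A dependency-free sketch using grep (`jq` is the nicer tool where available; `assert_ok` is an illustrative name):

```shell
#!/bin/sh
# assert_ok: succeed only when a health-check JSON payload reports status ok
assert_ok() {
    printf '%s' "$1" | grep -q '"status": *"ok"'
}
```

A pipeline step could then use `assert_ok "$(curl -fsS https://api.trading.com/health)" || exit 1`.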

7.2 Key Metrics

| Metric               | Threshold | Alert Channel |
|----------------------|-----------|---------------|
| API p95 latency      | <200ms    | PagerDuty     |
| API error rate       | <1%       | Slack         |
| ML prediction time   | <2s       | Email         |
| Database connections | <80%      | Slack         |
| Memory usage         | <85%      | Email         |
| CPU usage            | <75%      | Email         |

Prometheus Queries:

# API Error Rate
rate(http_requests_total{status=~"5.."}[5m]) /
rate(http_requests_total[5m]) > 0.01

# ML Prediction Latency
histogram_quantile(0.95, rate(ml_prediction_duration_bucket[5m])) > 2

# Database Connection Pool
pg_stat_activity_count / pg_settings_max_connections > 0.8
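The first PromQL expression is a ratio of two rates over the same window. The same arithmetic on raw counter deltas looks like this (a minimal sketch; `error_rate` is illustrative and takes the 5-minute deltas as plain numbers rather than querying Prometheus):

```shell
#!/bin/sh
# error_rate: fraction of 5xx requests, given the 5xx and total counter
# deltas over the window (awk provides floating-point division)
error_rate() {
    errs="$1"
    total="$2"
    awk -v e="$errs" -v t="$total" 'BEGIN {
        if (t == 0) { print "0.0000" } else { printf "%.4f\n", e / t }
    }'
}
```

The alert above fires when this fraction exceeds 0.01 (1%).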

7.3 Alerts

Slack Webhook:

slackSend(
    channel: '#deployments',
    color: 'good',
    message: """
        *Deploy Successful* :rocket:
        Service: ${env.IMAGE_NAME}
        Build: #${env.BUILD_NUMBER}
        Commit: ${env.GIT_COMMIT}
        Duration: ${currentBuild.durationString}
    """
)

PagerDuty (Critical):

curl -X POST https://events.pagerduty.com/v2/enqueue \
  -H 'Content-Type: application/json' \
  -d '{
    "routing_key": "...",
    "event_action": "trigger",
    "payload": {
      "summary": "Backend API down",
      "severity": "critical",
      "source": "jenkins"
    }
  }'

8. Rollback Procedures

8.1 Rollback Backend (Blue/Green)

# From Jenkins
docker-compose -f docker-compose.prod.yml -p backend-blue up -d
curl -X POST http://prod-lb.internal/switch -d '{"target": "blue"}'

Duration: <30 seconds


8.2 Rollback ML Engine

# Pull the previous image
docker pull registry.trading.com/ml-engine:${PREVIOUS_TAG}

# Recreate the service with the previous image
docker-compose -f docker-compose.prod.yml up -d --force-recreate ml-engine

Duration: ~2 minutes


8.3 Rollback Database

# Restore from backup
aws s3 cp s3://trading-backups/db/backup_YYYYMMDD_HHMMSS.sql.gz .
gunzip backup.sql.gz
psql $DATABASE_URL < backup.sql

Duration: ~10-30 minutes (depends on database size)


9. Secrets Management

9.1 Jenkins Credentials

# Create a credential from an XML definition
jenkins-cli create-credentials-by-xml system::system::jenkins \
    < credentials/stripe-secret-key.xml

credentials/stripe-secret-key.xml:

<org.jenkinsci.plugins.plaincredentials.impl.StringCredentialsImpl>
  <scope>GLOBAL</scope>
  <id>stripe-secret-key</id>
  <description>Stripe Secret Key</description>
  <secret>sk_live_...</secret>
</org.jenkinsci.plugins.plaincredentials.impl.StringCredentialsImpl>

A "Secret text" credential (plain-credentials plugin) is used here so that credentials('stripe-secret-key') binds the key directly; a username/password credential would bind as user:pass.

9.2 Rotation Schedule

| Secret               | Rotation  | Method        |
|----------------------|-----------|---------------|
| JWT_SECRET           | Quarterly | Manual        |
| Database password    | Monthly   | Automated     |
| API keys (external)  | Yearly    | Manual        |
| SSL certificates     | Automatic | Let's Encrypt |

10. References


Version History:

| Version | Date       | Changes          |
|---------|------------|------------------|
| 1.0.0   | 2025-12-05 | Initial creation |