| id | title | type | project | version | updated_date |
|---|---|---|---|---|---|
| JENKINS-DEPLOY | Jenkins CI/CD - Deploy Configuration | Documentation | trading-platform | 1.0.0 | 2026-01-04 |
Jenkins CI/CD - Deploy Configuration
Version: 1.0.0 Date: 2025-12-05 Author: Documentation and Planning Agent Status: Approved
1. Executive Summary
This document defines the complete CI/CD strategy for the Trading Platform using Jenkins, including:
- Pipelines for each service
- Required environment variables
- Assigned production ports
- Deploy strategies (blue-green, canary, rolling)
- Post-deploy monitoring
2. Deploy Architecture
┌──────────────────────────────────────────────────────────────────────────┐
│ JENKINS SERVER │
│ jenkins.trading.com │
│ │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ PIPELINES │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────┐ │ │
│ │ │ Frontend │ │ Backend │ │ML Engine │ │ Data │ │ DB │ │ │
│ │ │ Build │ │ Build │ │ Deploy │ │ Service │ │ Mig │ │ │
│ │ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ └──┬───┘ │ │
│ └───────┼─────────────┼─────────────┼─────────────┼────────────┼─────┘ │
│ │ │ │ │ │ │
└──────────┼─────────────┼─────────────┼─────────────┼────────────┼─────────┘
│ │ │ │ │
▼ ▼ ▼ ▼ ▼
┌──────────────────────────────────────────────────────────────────────────┐
│ PRODUCTION ENVIRONMENT │
│ prod.trading.com │
│ │
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌─────┐│
│ │ Cloudflare │ │ Docker │ │ Docker │ │ Docker │ │ PG ││
│ │ Pages │ │ Backend │ │ ML Engine │ │ Data │ │ 15 ││
│ │ │ │ :3001 │ │ :8000 │ │ :8001 │ │:5432││
│ │ Frontend │ │ │ │ │ │ │ │ ││
│ │ Assets │ │ Node.js │ │ Python │ │ Python │ │Redis││
│ └────────────┘ └────────────┘ └────────────┘ └────────────┘ │:6379││
│ └─────┘│
│ │
│ Load Balancer: Nginx (ports 80/443 → 3001) │
│ SSL/TLS: Let's Encrypt (auto-renew) │
│ Monitoring: Prometheus + Grafana │
└──────────────────────────────────────────────────────────────────────────┘
3. Pipelines per Service
3.1 Frontend Pipeline (Cloudflare Pages)
Jenkinsfile: apps/frontend/Jenkinsfile
pipeline {
agent any
environment {
CLOUDFLARE_API_TOKEN = credentials('cloudflare-api-token')
PROJECT_NAME = 'trading-frontend'
VITE_API_URL = 'https://api.trading.com'
VITE_WS_URL = 'wss://api.trading.com'
VITE_STRIPE_PUBLIC_KEY = credentials('stripe-public-key')
}
stages {
stage('Checkout') {
steps {
git branch: 'main',
url: 'https://github.com/trading/trading-platform.git'
}
}
stage('Install Dependencies') {
steps {
dir('apps/frontend') {
sh 'npm ci'
}
}
}
stage('Lint') {
steps {
dir('apps/frontend') {
sh 'npm run lint'
}
}
}
stage('Type Check') {
steps {
dir('apps/frontend') {
sh 'npm run type-check'
}
}
}
stage('Unit Tests') {
steps {
dir('apps/frontend') {
sh 'npm run test:ci'
}
}
}
stage('Build') {
steps {
dir('apps/frontend') {
sh 'npm run build'
}
}
}
stage('E2E Tests (Staging)') {
when {
branch 'main'
}
steps {
dir('apps/frontend') {
sh 'npm run test:e2e:ci'
}
}
}
stage('Deploy to Cloudflare') {
when {
branch 'main'
}
steps {
dir('apps/frontend') {
sh '''
npx wrangler pages deploy dist \
--project-name=$PROJECT_NAME \
--branch=production
'''
}
}
}
stage('Smoke Tests') {
when {
branch 'main'
}
steps {
sh '''
curl -f https://trading.com/health || exit 1
'''
}
}
}
post {
success {
slackSend(
color: 'good',
message: "Frontend deployed successfully: ${env.BUILD_URL}"
)
}
failure {
slackSend(
color: 'danger',
message: "Frontend deploy FAILED: ${env.BUILD_URL}"
)
}
always {
cleanWs()
}
}
}
Triggers:
- Push to main branch
- Manual trigger
- Cron: nightly build (H 2 * * *)
Duration: ~5-8 minutes
3.2 Backend API Pipeline (Docker)
Jenkinsfile: apps/backend/Jenkinsfile
pipeline {
agent any
environment {
DOCKER_REGISTRY = 'registry.trading.com'
IMAGE_NAME = 'backend-api'
IMAGE_TAG = "${env.BUILD_NUMBER}"
DATABASE_URL = credentials('production-database-url')
REDIS_URL = credentials('production-redis-url')
JWT_SECRET = credentials('jwt-secret')
STRIPE_SECRET_KEY = credentials('stripe-secret-key')
STRIPE_WEBHOOK_SECRET = credentials('stripe-webhook-secret')
ML_ENGINE_URL = 'http://ml-engine:8000'
CLAUDE_API_KEY = credentials('claude-api-key')
}
stages {
stage('Checkout') {
steps {
git branch: 'main',
url: 'https://github.com/trading/trading-platform.git'
}
}
stage('Install Dependencies') {
steps {
dir('apps/backend') {
sh 'npm ci'
}
}
}
stage('Lint') {
steps {
dir('apps/backend') {
sh 'npm run lint'
}
}
}
stage('Type Check') {
steps {
dir('apps/backend') {
sh 'npm run type-check'
}
}
}
stage('Unit Tests') {
steps {
dir('apps/backend') {
sh 'npm run test:ci'
}
}
}
stage('Build TypeScript') {
steps {
dir('apps/backend') {
sh 'npm run build'
}
}
}
stage('Integration Tests') {
steps {
dir('apps/backend') {
sh '''
docker-compose -f docker-compose.test.yml up -d
npm run test:integration
docker-compose -f docker-compose.test.yml down
'''
}
}
}
stage('Build Docker Image') {
steps {
dir('apps/backend') {
sh '''
docker build \
-t $DOCKER_REGISTRY/$IMAGE_NAME:$IMAGE_TAG \
-t $DOCKER_REGISTRY/$IMAGE_NAME:latest \
.
'''
}
}
}
stage('Security Scan') {
steps {
sh '''
docker run --rm \
-v /var/run/docker.sock:/var/run/docker.sock \
aquasec/trivy image \
$DOCKER_REGISTRY/$IMAGE_NAME:$IMAGE_TAG
'''
}
}
stage('Push to Registry') {
steps {
sh '''
docker push $DOCKER_REGISTRY/$IMAGE_NAME:$IMAGE_TAG
docker push $DOCKER_REGISTRY/$IMAGE_NAME:latest
'''
}
}
stage('Deploy - Blue/Green') {
when {
branch 'main'
}
steps {
script {
// Determine the currently active color
def currentColor = sh(
script: '''
curl -s http://prod-lb.internal/health | \
jq -r '.color' || echo 'blue'
''',
returnStdout: true
).trim()
def newColor = currentColor == 'blue' ? 'green' : 'blue'
echo "Current: $currentColor, Deploying: $newColor"
// Deploy to the new color
sh """
docker-compose -f docker-compose.prod.yml \
-p backend-$newColor \
up -d --force-recreate
"""
// Health check
sh """
for i in $(seq 1 30); do  # seq instead of bash-only {1..30}; Jenkins sh may be POSIX sh
if curl -f http://backend-$newColor:3001/health; then
echo "Backend $newColor is healthy"
break
fi
echo "Waiting for backend $newColor... ($i/30)"
sleep 2
done
"""
// Switch load balancer
sh """
curl -X POST http://prod-lb.internal/switch \
-d '{"target": "$newColor"}'
"""
// Wait for traffic drain
sleep 10
// Stop old version
sh """
docker-compose -f docker-compose.prod.yml \
-p backend-$currentColor \
down
"""
}
}
}
stage('Smoke Tests') {
when {
branch 'main'
}
steps {
sh '''
curl -f https://api.trading.com/health || exit 1
curl -f https://api.trading.com/health/detailed || exit 1
'''
}
}
}
post {
success {
slackSend(
color: 'good',
message: "Backend API deployed successfully: Build #${env.BUILD_NUMBER}"
)
}
failure {
slackSend(
color: 'danger',
message: "Backend API deploy FAILED: ${env.BUILD_URL}"
)
// Automatic rollback
script {
sh '''
docker-compose -f docker-compose.prod.yml \
-p backend-blue \
up -d --force-recreate
'''
}
}
always {
junit 'apps/backend/coverage/junit.xml'
publishHTML([
reportDir: 'apps/backend/coverage',
reportFiles: 'index.html',
reportName: 'Coverage Report'
])
cleanWs()
}
}
}
Triggers:
- Push to main branch
- Manual trigger
Duration: ~10-15 minutes
Rollback Strategy: blue/green deployment with automatic rollback on failure
3.3 ML Engine Pipeline (Docker GPU)
Jenkinsfile: apps/ml-engine/Jenkinsfile
pipeline {
agent {
label 'gpu-node' // Node with GPU
}
environment {
DOCKER_REGISTRY = 'registry.trading.com'
IMAGE_NAME = 'ml-engine'
IMAGE_TAG = "${env.BUILD_NUMBER}"
DATABASE_URL = credentials('production-database-url')
REDIS_URL = credentials('production-redis-url')
MODEL_PATH = '/app/models'
ML_API_KEY = credentials('ml-api-key') // used by the Smoke Tests stage; credential id assumed
}
stages {
stage('Checkout') {
steps {
git branch: 'main',
url: 'https://github.com/trading/trading-platform.git'
}
}
stage('Validate Models') {
steps {
dir('apps/ml-engine') {
sh '''
# Verify that the models exist
test -f models/phase2/range_predictor/15m/model_high.json
test -f models/phase2/range_predictor/15m/model_low.json
test -f models/phase2/tpsl_classifier/15m_rr_2_1.json
'''
}
}
}
stage('Unit Tests') {
steps {
dir('apps/ml-engine') {
sh '''
python3 -m venv venv
. venv/bin/activate  # POSIX "." instead of bash-only "source"
pip install -r requirements.txt
pytest tests/ -v --cov=app
'''
}
}
}
stage('Build Docker Image') {
steps {
dir('apps/ml-engine') {
sh '''
docker build \
-t $DOCKER_REGISTRY/$IMAGE_NAME:$IMAGE_TAG \
-t $DOCKER_REGISTRY/$IMAGE_NAME:latest \
-f Dockerfile.gpu \
.
'''
}
}
}
stage('Test Prediction') {
steps {
sh '''
docker run --rm --gpus all \
-e DATABASE_URL=$DATABASE_URL \
$DOCKER_REGISTRY/$IMAGE_NAME:$IMAGE_TAG \
python -m app.test_prediction
'''
}
}
stage('Push to Registry') {
steps {
sh '''
docker push $DOCKER_REGISTRY/$IMAGE_NAME:$IMAGE_TAG
docker push $DOCKER_REGISTRY/$IMAGE_NAME:latest
'''
}
}
stage('Deploy - Canary') {
when {
branch 'main'
}
steps {
script {
// Deploy canary (10% traffic)
sh '''
docker-compose -f docker-compose.prod.yml \
-p ml-engine-canary \
up -d --force-recreate
'''
// Monitor metrics for 5 minutes
echo "Monitoring canary metrics..."
sleep 300
def errorRate = sh(
script: '''
curl -s http://prometheus:9090/api/v1/query \
--data 'query=rate(ml_prediction_errors[5m])' | \
jq -r '.data.result[0].value[1]'
''',
returnStdout: true
).trim().toFloat()
if (errorRate > 0.05) {
error("Canary error rate too high: ${errorRate}")
}
// Promote to 100%
sh '''
docker-compose -f docker-compose.prod.yml \
-p ml-engine \
up -d --force-recreate
# Stop canary
docker-compose -f docker-compose.prod.yml \
-p ml-engine-canary \
down
'''
}
}
}
stage('Smoke Tests') {
when {
branch 'main'
}
steps {
sh '''
curl -f http://ml-engine:8000/health || exit 1
curl -X POST http://ml-engine:8000/predictions \
-H "X-API-Key: $ML_API_KEY" \
-d '{"symbol": "BTCUSDT", "horizon": 18}' || exit 1
'''
}
}
}
post {
success {
slackSend(
color: 'good',
message: "ML Engine deployed successfully"
)
}
failure {
slackSend(
color: 'danger',
message: "ML Engine deploy FAILED - Rolling back"
)
// Rollback
sh '''
docker-compose -f docker-compose.prod.yml \
-p ml-engine-canary \
down
'''
}
always {
cleanWs()
}
}
}
Triggers:
- Manual (requires approval)
- Scheduled: weekly retrain (H 0 * * 0)
Duration: ~20-30 minutes
Deploy Strategy: canary (10% traffic → 100% if metrics are OK)
3.4 Data Service Pipeline
Jenkinsfile: apps/data-service/Jenkinsfile
pipeline {
agent any
environment {
DOCKER_REGISTRY = 'registry.trading.com'
IMAGE_NAME = 'data-service'
IMAGE_TAG = "${env.BUILD_NUMBER}"
DATABASE_URL = credentials('production-database-url')
POLYGON_API_KEY = credentials('polygon-api-key')
METAAPI_TOKEN = credentials('metaapi-token')
}
stages {
stage('Checkout') {
steps {
git branch: 'main',
url: 'https://github.com/trading/trading-platform.git'
}
}
stage('Unit Tests') {
steps {
dir('apps/data-service') {
sh '''
python3 -m venv venv
. venv/bin/activate  # POSIX "." instead of bash-only "source"
pip install -r requirements.txt
pytest tests/ -v
'''
}
}
}
stage('Build Docker Image') {
steps {
dir('apps/data-service') {
sh '''
docker build \
-t $DOCKER_REGISTRY/$IMAGE_NAME:$IMAGE_TAG \
-t $DOCKER_REGISTRY/$IMAGE_NAME:latest \
.
'''
}
}
}
stage('Push to Registry') {
steps {
sh '''
docker push $DOCKER_REGISTRY/$IMAGE_NAME:$IMAGE_TAG
docker push $DOCKER_REGISTRY/$IMAGE_NAME:latest
'''
}
}
stage('Deploy - Rolling Update') {
when {
branch 'main'
}
steps {
sh '''
docker-compose -f docker-compose.prod.yml \
up -d --no-deps --force-recreate data-service
'''
}
}
stage('Verify Sync') {
when {
branch 'main'
}
steps {
sh '''
sleep 60
curl -f http://data-service:8001/sync/status || exit 1
'''
}
}
}
post {
success {
slackSend(
color: 'good',
message: "Data Service deployed successfully"
)
}
failure {
slackSend(
color: 'danger',
message: "Data Service deploy FAILED"
)
}
always {
cleanWs()
}
}
}
Triggers:
- Push to main branch
- Manual trigger
Duration: ~8-12 minutes
Deploy Strategy: rolling update (the service can tolerate brief downtime)
3.5 Database Migration Pipeline
Jenkinsfile: apps/database/Jenkinsfile
pipeline {
agent any
environment {
DATABASE_URL = credentials('production-database-url')
BACKUP_BUCKET = 's3://trading-backups/db'
}
stages {
stage('Checkout') {
steps {
git branch: 'main',
url: 'https://github.com/trading/trading-platform.git'
}
}
stage('Pre-Migration Backup') {
steps {
sh '''
# Full backup before migrating
timestamp=$(date +%Y%m%d_%H%M%S)
pg_dump $DATABASE_URL | gzip > backup_$timestamp.sql.gz
# Upload to S3
aws s3 cp backup_$timestamp.sql.gz $BACKUP_BUCKET/
echo "Backup saved: $BACKUP_BUCKET/backup_$timestamp.sql.gz"
'''
}
}
stage('Validate Migrations') {
steps {
dir('apps/database/migrations') {
sh '''
# psql has no --dry-run flag; validate each file by running it
# inside a transaction that is always rolled back
for file in *.sql; do
psql $DATABASE_URL -v ON_ERROR_STOP=1 -c 'BEGIN;' -f "$file" -c 'ROLLBACK;'
done
'''
}
}
}
stage('Run Migrations') {
steps {
input message: 'Approve migration to production?', ok: 'Deploy'
dir('apps/database/migrations') {
sh '''
# Apply migrations in order
for file in $(ls -1 *.sql | sort); do
echo "Applying $file..."
psql $DATABASE_URL -f $file
done
'''
}
}
}
stage('Verify Schema') {
steps {
sh '''
# Verify that the tables exist
psql $DATABASE_URL -c "\\dt"
# Verify that the functions exist
psql $DATABASE_URL -c "\\df"
'''
}
}
stage('Run Smoke Tests') {
steps {
sh '''
# Test basic queries
psql $DATABASE_URL -c "SELECT COUNT(*) FROM public.users;"
psql $DATABASE_URL -c "SELECT COUNT(*) FROM market_data.ohlcv_5m;"
'''
}
}
}
post {
success {
slackSend(
color: 'good',
message: "Database migration completed successfully"
)
}
failure {
slackSend(
color: 'danger',
message: "Database migration FAILED - Manual intervention required"
)
input message: 'Restore from backup?', ok: 'Restore'
sh '''
# Restore from the latest backup
latest_backup=$(aws s3 ls $BACKUP_BUCKET/ | sort | tail -n 1 | awk '{print $4}')
aws s3 cp $BACKUP_BUCKET/$latest_backup backup.sql.gz
gunzip backup.sql.gz
psql $DATABASE_URL < backup.sql
'''
}
always {
cleanWs()
}
}
}
Triggers:
- Manual (requires explicit approval)
Duration: ~5-10 minutes (depends on database size)
Safety: automatic backup before every migration
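Note that the Run Migrations stage above re-applies every `*.sql` file on each run. A common safeguard is to track which migrations have already been applied and only run the pending ones. A minimal POSIX-sh sketch of that selection (the `applied.txt` ledger file is an assumption for illustration; in practice this is usually a `schema_migrations` table):

```shell
#!/bin/sh
# pending_migrations DIR LEDGER: print, in sorted order, the .sql files in DIR
# whose names are not yet recorded (one filename per line) in the LEDGER file.
pending_migrations() {
  dir=$1; ledger=$2
  for f in $(ls -1 "$dir"/*.sql 2>/dev/null | sort); do
    name=$(basename "$f")
    grep -qx "$name" "$ledger" || echo "$name"
  done
}
# After successfully applying a migration, record it:
#   psql $DATABASE_URL -v ON_ERROR_STOP=1 -f "$dir/$name" && echo "$name" >> "$ledger"
```

This keeps the pipeline idempotent: re-running it after a partial failure resumes at the first unapplied file instead of replaying the whole directory.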
4. Environment Variables per Service
4.1 Frontend
# Build time (.env.production)
VITE_API_URL=https://api.trading.com
VITE_WS_URL=wss://api.trading.com
VITE_STRIPE_PUBLIC_KEY=pk_live_...
VITE_GOOGLE_CLIENT_ID=...
VITE_FACEBOOK_APP_ID=...
4.2 Backend API
# Runtime (.env.production)
NODE_ENV=production
PORT=3001
# Database
DATABASE_URL=postgresql://user:pass@prod-db.internal:5432/trading
REDIS_URL=redis://prod-redis.internal:6379
# Auth
JWT_SECRET=...
JWT_EXPIRES_IN=15m
REFRESH_TOKEN_EXPIRES_IN=7d
# Stripe
STRIPE_SECRET_KEY=sk_live_...
STRIPE_WEBHOOK_SECRET=whsec_...
# External Services
ML_ENGINE_URL=http://ml-engine:8000
ML_API_KEY=...
CLAUDE_API_KEY=sk-ant-...
TWILIO_ACCOUNT_SID=...
TWILIO_AUTH_TOKEN=...
# CORS
FRONTEND_URL=https://trading.com
# Monitoring
SENTRY_DSN=...
LOG_LEVEL=info
4.3 ML Engine
# Runtime (.env.production)
DATABASE_URL=postgresql://user:pass@prod-db.internal:5432/trading
REDIS_URL=redis://prod-redis.internal:6379
# Models
MODEL_PATH=/app/models
SUPPORTED_SYMBOLS=BTCUSDT,ETHUSDT,XAUUSD,EURUSD,GBPUSD
DEFAULT_HORIZONS=6,18,36,72
# GPU
CUDA_VISIBLE_DEVICES=0
# API
API_HOST=0.0.0.0
API_PORT=8000
API_KEY=...
4.4 Data Service
# Runtime (.env.production)
DATABASE_URL=postgresql://user:pass@prod-db.internal:5432/trading
# Polygon
POLYGON_API_KEY=...
POLYGON_BASE_URL=https://api.polygon.io
POLYGON_RATE_LIMIT=100
# MetaAPI
METAAPI_TOKEN=...
METAAPI_ACCOUNT_ID=...
# Sync
SYNC_INTERVAL_MINUTES=5
BACKFILL_DAYS=30
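Services should fail fast at startup when a required variable from the lists above is missing, rather than erroring mid-request. A minimal POSIX-sh sketch of such a preflight check (the helper name and usage line are illustrative):

```shell
#!/bin/sh
# require VAR...: report every variable that is unset or empty; return 1 if any.
require() {
  missing=""
  for v in "$@"; do
    eval "val=\${$v:-}"            # indirect lookup of the variable named in $v
    [ -n "$val" ] || missing="$missing $v"
  done
  if [ -n "$missing" ]; then
    echo "Missing required env vars:$missing" >&2
    return 1
  fi
}
# Example (Data Service):
#   require DATABASE_URL POLYGON_API_KEY METAAPI_TOKEN || exit 1
```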
5. Production Ports
| Service | Internal Port | Public Port | Protocol |
|---|---|---|---|
| Frontend | - | 443 (HTTPS) | HTTPS |
| Backend API | 3001 | 443 (via Nginx) | HTTPS |
| ML Engine | 8000 | - (internal) | HTTP |
| Data Service | 8001 | - (internal) | HTTP |
| PostgreSQL | 5432 | - (internal) | TCP |
| Redis | 6379 | - (internal) | TCP |
| Prometheus | 9090 | - (internal) | HTTP |
| Grafana | 3000 | - (VPN) | HTTP |
Nginx Config:
upstream backend {
server backend-blue:3001 down;  # idle color (nginx rejects weight=0; mark it down instead)
server backend-green:3001;      # active color
}
server {
listen 443 ssl http2;
server_name api.trading.com;
ssl_certificate /etc/letsencrypt/live/api.trading.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/api.trading.com/privkey.pem;
location / {
proxy_pass http://backend;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
6. Deploy Strategies
6.1 Blue/Green (Backend API)
┌─────────────────────────────────────────────────────┐
│ Load Balancer │
│ (100% blue / 0% green) │
└───────────┬─────────────────────────────────────────┘
│
┌──────┴──────┐
▼ ▼
┌─────────┐ ┌─────────┐
│ BLUE │ │ GREEN │
│ Active │ │ Idle │
└─────────┘ └─────────┘
1. Deploy to GREEN (0% traffic)
2. Health check GREEN
3. Switch to GREEN (100% traffic)
4. Stop BLUE
Advantages:
- Zero downtime
- Instant rollback
- Full testing pre-production
Disadvantages:
- Requires 2x resources
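The flip in step 3 reduces to computing the idle color and pointing the load balancer at it. A sketch of that selection logic (the `/health` and `/switch` endpoints are the same ones the Backend pipeline above already calls):

```shell
#!/bin/sh
# next_color CURRENT: print the idle color that the new version should deploy to.
next_color() {
  if [ "$1" = "blue" ]; then echo green; else echo blue; fi
}
# Usage in the pipeline:
#   current=$(curl -s http://prod-lb.internal/health | jq -r '.color')
#   new=$(next_color "$current")
#   curl -X POST http://prod-lb.internal/switch -d "{\"target\": \"$new\"}"
```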
6.2 Canary (ML Engine)
┌─────────────────────────────────────────────────────┐
│ Load Balancer │
│ (90% stable / 10% canary) │
└───────────┬─────────────────────────────────────────┘
│
┌──────┴──────┐
▼ ▼
┌─────────┐ ┌─────────┐
│ STABLE │ │ CANARY │
│ v1.5 │ │ v1.6 │
└─────────┘ └─────────┘
1. Deploy canary (10% traffic)
2. Monitor metrics 5-10 min
3. If OK: Promote to 100%
4. If ERROR: Rollback
Advantages:
- Early error detection
- Limited blast radius
Disadvantages:
- Slower rollout
- Complex routing
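The promotion decision in steps 3-4 is a threshold check on the sampled error rate; the 0.05 cutoff below mirrors the ML Engine pipeline above. Shell arithmetic is integer-only, so a sketch of the gate delegates the float comparison to awk:

```shell
#!/bin/sh
# canary_ok RATE THRESHOLD: succeed (exit 0) when the canary error rate is
# strictly below the threshold, i.e. the canary may be promoted to 100%.
canary_ok() {
  awk -v r="$1" -v t="$2" 'BEGIN { exit !(r < t) }'
}
# rate=$(curl -s http://prometheus:9090/api/v1/query \
#   --data 'query=rate(ml_prediction_errors[5m])' | jq -r '.data.result[0].value[1]')
# canary_ok "$rate" 0.05 || rollback_canary
```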
6.3 Rolling Update (Data Service)
┌──────┐ ┌──────┐ ┌──────┐
│ v1.5 │ │ v1.5 │ │ v1.5 │
└──┬───┘ └──┬───┘ └──┬───┘
│ │ │
▼ │ │
┌──────┐ │ │
│ v1.6 │ │ │ Step 1: Update instance 1
└──────┘ │ │
▼ │
┌──────┐ │
│ v1.6 │ │ Step 2: Update instance 2
└──────┘ │
▼
┌──────┐
│ v1.6 │ Step 3: Update instance 3
└──────┘
Advantages:
- Simple
- Minimal extra resources
Disadvantages:
- Brief downtime per instance
- Slower rollout
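Each step of the rolling update must wait for the recreated instance to pass its health check before moving to the next one (the Backend pipeline does the same with an inline loop). A reusable POSIX-sh sketch of that wait:

```shell
#!/bin/sh
# wait_healthy URL [TRIES] [DELAY]: poll URL until curl succeeds, retrying
# TRIES times (default 30) with DELAY seconds between attempts (default 2).
wait_healthy() {
  url=$1; tries=${2:-30}; delay=${3:-2}
  i=1
  while [ "$i" -le "$tries" ]; do
    curl -fs "$url" >/dev/null 2>&1 && return 0
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
# Per instance:
#   docker-compose -f docker-compose.prod.yml up -d --no-deps --force-recreate data-service
#   wait_healthy http://data-service:8001/sync/status || exit 1
```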
7. Post-Deploy Monitoring
7.1 Health Checks
Backend API:
curl https://api.trading.com/health
# Expected:
{
"status": "ok",
"database": "connected",
"redis": "connected",
"uptime": 12345,
"version": "1.5.2"
}
ML Engine:
curl http://ml-engine:8000/health
# Expected:
{
"status": "ok",
"gpu": true,
"models_loaded": 5,
"uptime": 67890
}
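Smoke tests should assert on the payload fields, not just on the HTTP status code. A minimal sketch using only grep against the `/health` response shape shown above (in a real pipeline jq is the better fit; the helper name is illustrative):

```shell
#!/bin/sh
# assert_healthy JSON: succeed only if status is "ok" and both
# dependencies report "connected" in the health payload.
assert_healthy() {
  echo "$1" | grep -q '"status": *"ok"' &&
  echo "$1" | grep -q '"database": *"connected"' &&
  echo "$1" | grep -q '"redis": *"connected"'
}
# body=$(curl -s https://api.trading.com/health)
# assert_healthy "$body" || exit 1
```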
7.2 Key Metrics
| Metric | Threshold | Alert |
|---|---|---|
| API p95 latency | <200ms | PagerDuty |
| API error rate | <1% | Slack |
| ML prediction time | <2s | - |
| Database connections | <80% | Slack |
| Memory usage | <85% | - |
| CPU usage | <75% | - |
Prometheus Queries:
# API Error Rate
rate(http_requests_total{status=~"5.."}[5m]) /
rate(http_requests_total[5m]) > 0.01
# ML Prediction Latency
histogram_quantile(0.95, rate(ml_prediction_duration_bucket[5m])) > 2
# Database Connection Pool
pg_stat_activity_count / pg_settings_max_connections > 0.8
7.3 Alerts
Slack Webhook:
slackSend(
channel: '#deployments',
color: 'good',
message: """
*Deploy Successful* :rocket:
Service: ${env.IMAGE_NAME}
Build: #${env.BUILD_NUMBER}
Commit: ${env.GIT_COMMIT}
Duration: ${currentBuild.durationString}
"""
)
PagerDuty (Critical):
curl -X POST https://events.pagerduty.com/v2/enqueue \
-H 'Content-Type: application/json' \
-d '{
"routing_key": "...",
"event_action": "trigger",
"payload": {
"summary": "Backend API down",
"severity": "critical",
"source": "jenkins"
}
}'
8. Rollback Procedures
8.1 Rollback Backend (Blue/Green)
# From Jenkins
docker-compose -f docker-compose.prod.yml -p backend-blue up -d
curl -X POST http://prod-lb.internal/switch -d '{"target": "blue"}'
Duration: <30 seconds
8.2 Rollback ML Engine
# Pull the previous image
docker pull registry.trading.com/ml-engine:${PREVIOUS_TAG}
# Recreate using the previous image
docker-compose -f docker-compose.prod.yml up -d --force-recreate ml-engine
Duration: ~2 minutes
8.3 Rollback Database
# Restore from backup
aws s3 cp s3://trading-backups/db/backup_YYYYMMDD_HHMMSS.sql.gz .
gunzip backup.sql.gz
psql $DATABASE_URL < backup.sql
Duration: ~10-30 minutes (depends on size)
9. Secrets Management
9.1 Jenkins Credentials
# Create a credential (the trailing "_" selects the global credentials domain)
jenkins-cli create-credentials-by-xml system::system::jenkins _ \
< credentials/stripe-secret-key.xml
credentials/stripe-secret-key.xml:
<org.jenkinsci.plugins.plaincredentials.impl.StringCredentialsImpl>
<scope>GLOBAL</scope>
<id>stripe-secret-key</id>
<description>Stripe Secret Key</description>
<secret>sk_live_...</secret>
</org.jenkinsci.plugins.plaincredentials.impl.StringCredentialsImpl>
9.2 Rotation Schedule
| Secret | Rotation | Method |
|---|---|---|
| JWT_SECRET | Quarterly | Manual |
| Database password | Monthly | Automated |
| API keys (external) | Yearly | Manual |
| SSL certificates | Automated | Let's Encrypt |
10. References
- PLAN-DESARROLLO-DETALLADO.md
- DIAGRAMA-INTEGRACIONES.md
- MATRIZ-DEPENDENCIAS.yml
- Jenkins Documentation
- Docker Best Practices
Version History:
| Version | Date | Changes |
|---|---|---|
| 1.0.0 | 2025-12-05 | Initial creation |