| id | title | type | project | version | updated_date |
|---|---|---|---|---|---|
| ADR-006-caching | Caching Strategy | Documentation | trading-platform | 1.0.0 | 2026-01-04 |
ADR-005: Caching Strategy
Status: Accepted | Date: 2025-12-06 | Deciders: Tech Lead, Architect | Related: ADR-001
Context
The Trading Platform needs to optimize performance in several areas:
- ML Predictions: models take 200-500ms per inference, while users expect < 100ms
- Market Data: trading APIs are limited to 100 requests/min
- User Sessions: JWT validation on every request is expensive
- Rate Limiting: prevent API abuse (DDoS, brute force)
- Leaderboards: complex queries aggregating data from many users
- Historical Data: historical data rarely changes
Performance Requirements:
- API responses < 200ms (p95)
- ML predictions < 100ms (cached)
- Support for 10K concurrent users (post-MVP)
Decision
Cache Layer: Redis 7.x
Redis was chosen as the primary cache because it offers:
- In-memory storage: < 1ms latency
- Pub/Sub: for distributed cache invalidation
- Data structures: Lists, Sets, Sorted Sets for leaderboards
- Native TTL: automatic key expiration
- Optional persistence: RDB snapshots for disaster recovery
Cache Strategy per Data Type
| Data Type | Strategy | TTL | Invalidation |
|---|---|---|---|
| Sessions | Write-through | 7 days | Logout, password change |
| ML Predictions | Cache-aside | 5 min | Model retrain |
| Market Data | Cache-aside | 1 min | Webhook from broker API |
| User Profile | Write-through | 1 hour | Profile update |
| Leaderboards | Cache-aside | 10 min | Cron job rebuild |
| Historical OHLCV | Cache-aside | 24 hours | Never (immutable) |
| Rate Limit | Counter | 1 min | Auto-expire |
TTL Configuration
```typescript
// apps/backend/src/config/cache.ts
export const CACHE_TTL = {
  SESSION: 60 * 60 * 24 * 7,      // 7 days
  ML_PREDICTION: 60 * 5,          // 5 minutes
  MARKET_DATA: 60,                // 1 minute
  USER_PROFILE: 60 * 60,          // 1 hour
  LEADERBOARD: 60 * 10,           // 10 minutes
  OHLCV_HISTORICAL: 60 * 60 * 24, // 24 hours
  RATE_LIMIT: 60,                 // 1 minute
} as const;
```
Redis Key Naming Convention
```
{app}:{entity}:{id}:{version}
```
Examples:
- `session:user:123abc:v1`
- `ml:prediction:AAPL:1d:v2`
- `market:ohlcv:TSLA:2025-12-06:v1`
- `user:profile:456def:v1`
- `leaderboard:monthly:2025-12:v1`
- `ratelimit:api:192.168.1.1:v1`
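A small helper can keep key construction consistent across services. This is a hypothetical sketch (the `cacheKey` function is not part of the codebase described above); composite ids like `AAPL:1d` simply pass through as the `id` segment:

```typescript
// Hypothetical helper enforcing the {app}:{entity}:{id}:{version} convention.
// Centralizing key construction avoids typo'd prefixes that silently miss the cache.
function cacheKey(app: string, entity: string, id: string, version = 'v1'): string {
  return [app, entity, id, version].join(':');
}

console.log(cacheKey('session', 'user', '123abc'));
console.log(cacheKey('ml', 'prediction', 'AAPL:1d', 'v2'));
```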
Cache-Aside Pattern (Read-Heavy)
```typescript
// apps/backend/src/services/market.service.ts
async function getOHLCV(symbol: string, date: string) {
  const cacheKey = `market:ohlcv:${symbol}:${date}:v1`;

  // 1. Try cache first
  const cached = await redis.get(cacheKey);
  if (cached) {
    return JSON.parse(cached);
  }

  // 2. Cache miss → fetch from DB
  const data = await db.ohlcv.findUnique({
    where: { symbol, date }
  });

  // 3. Store in cache (skip null so a missing row isn't cached as "null")
  if (data) {
    await redis.setex(
      cacheKey,
      CACHE_TTL.OHLCV_HISTORICAL,
      JSON.stringify(data)
    );
  }
  return data;
}
```
Write-Through Pattern (Write-Heavy)
```typescript
// apps/backend/src/services/user.service.ts
async function updateProfile(userId: string, data: ProfileUpdate) {
  const cacheKey = `user:profile:${userId}:v1`;

  // 1. Write to DB first
  const updated = await db.user.update({
    where: { id: userId },
    data
  });

  // 2. Update cache
  await redis.setex(
    cacheKey,
    CACHE_TTL.USER_PROFILE,
    JSON.stringify(updated)
  );
  return updated;
}
```
Rate Limiting
```typescript
// apps/backend/src/middleware/rateLimit.ts
async function rateLimit(req: Request, res: Response, next: NextFunction) {
  const ip = req.ip;
  const key = `ratelimit:api:${ip}:v1`;

  const current = await redis.incr(key);
  if (current === 1) {
    await redis.expire(key, CACHE_TTL.RATE_LIMIT);
  }
  if (current > 100) { // 100 requests per minute
    return res.status(429).json({
      error: 'Too many requests, try again later'
    });
  }
  next();
}
```
ML Predictions Cache
```python
# apps/ml-engine/cache.py
import json
from datetime import timedelta

import redis

# Sync client shown for brevity; a fully async service would use redis.asyncio
redis_client = redis.Redis(host='localhost', port=6379, decode_responses=True)

async def get_prediction(symbol: str, timeframe: str):
    cache_key = f"ml:prediction:{symbol}:{timeframe}:v2"

    # Try cache
    cached = redis_client.get(cache_key)
    if cached:
        return json.loads(cached)

    # Generate prediction (expensive); `model` is loaded elsewhere in the app
    prediction = await model.predict(symbol, timeframe)

    # Cache for 5 minutes
    redis_client.setex(
        cache_key,
        timedelta(minutes=5),
        json.dumps(prediction)
    )
    return prediction
```
Cache Invalidation
```typescript
// apps/backend/src/services/cache.service.ts
class CacheInvalidator {
  // Invalidate single key
  async invalidate(key: string) {
    await redis.del(key);
  }

  // Invalidate pattern (use with caution: KEYS blocks Redis; prefer SCAN in production)
  async invalidatePattern(pattern: string) {
    const keys = await redis.keys(pattern);
    if (keys.length > 0) {
      await redis.del(...keys);
    }
  }

  // Invalidate on model retrain
  async invalidateMLPredictions() {
    await this.invalidatePattern('ml:prediction:*');
  }

  // Invalidate user session
  async invalidateSession(userId: string) {
    await this.invalidatePattern(`session:user:${userId}:*`);
  }
}
```
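The invalidator above clears keys in the shared Redis instance; the Pub/Sub feature cited in the decision is what would propagate invalidations to per-instance near-caches. A minimal sketch of that flow, with Node's built-in `EventEmitter` standing in for the Redis channel (the `cache:invalidate` channel name and the prefix-based message shape are assumptions, not part of the design above; in production these would be `redis.publish` / `subscriber.subscribe` calls):

```typescript
import { EventEmitter } from 'node:events';

// Stand-in for a shared Redis Pub/Sub channel.
const bus = new EventEmitter();
const CHANNEL = 'cache:invalidate';

// Each app instance keeps a small local near-cache in front of Redis
// and subscribes to invalidation broadcasts.
class AppInstance {
  readonly localCache = new Map<string, string>();

  constructor() {
    // On a broadcast, drop every locally cached key matching the prefix.
    bus.on(CHANNEL, (prefix: string) => {
      for (const key of this.localCache.keys()) {
        if (key.startsWith(prefix)) this.localCache.delete(key);
      }
    });
  }
}

// Publish an invalidation so every subscribed instance clears its copy.
function broadcastInvalidation(prefix: string) {
  bus.emit(CHANNEL, prefix);
}

const a = new AppInstance();
const b = new AppInstance();
a.localCache.set('ml:prediction:AAPL:1d:v2', 'stale');
b.localCache.set('ml:prediction:AAPL:1d:v2', 'stale');
b.localCache.set('user:profile:456def:v1', 'kept');

broadcastInvalidation('ml:prediction:'); // e.g. triggered after a model retrain
```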
Consequences
Positive
- Performance: API responses < 100ms (cached) vs 500ms (uncached)
- Cost Savings: 90% fewer requests to external market data APIs
- Scalability: Redis handles 100K ops/sec on modest hardware
- Rate Limiting: prevents abuse without affecting legitimate users
- ML Efficiency: cached predictions reduce GPU load
- Developer Experience: Redis CLI for cache debugging
- Session Management: instant logout via invalidation
Negative
- Stale Data: the cache can serve outdated data for up to the TTL
- Memory Cost: Redis needs ~2GB RAM for 10K active users
- Complexity: cache invalidation is "one of the two hard problems in computer science"
- Cold Start: the first requests are slow (cache warming needed)
- Thundering Herd: simultaneous requests pile up when an entry expires
- Redis SPOF: if Redis goes down, performance degrades (the app keeps working, just slower)
Risks and Mitigations
| Risk | Mitigation |
|---|---|
| Cache invalidation bugs | Unit and integration tests for cache logic |
| Thundering herd | Implement cache locking (SETNX pattern) |
| Redis down | Graceful degradation: the app works without cache |
| Memory exhausted | Eviction policy: allkeys-lru |
| Stale ML predictions | Short TTL (5 min) + manual invalidation |
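The SETNX cache-locking mitigation from the table can be sketched as follows. This is an illustration, not the production implementation: `FakeRedis` is an in-memory stand-in for the handful of Redis commands used, and `fetchFromDb` is a hypothetical slow query. On a miss, only the caller that wins the lock recomputes; the rest briefly poll until the winner fills the cache.

```typescript
// Minimal in-memory stand-in for the Redis commands used below.
class FakeRedis {
  private store = new Map<string, string>();
  async get(key: string) { return this.store.get(key) ?? null; }
  async set(key: string, value: string) { this.store.set(key, value); }
  async setnx(key: string, value: string) {
    if (this.store.has(key)) return 0;
    this.store.set(key, value);
    return 1;
  }
  async del(key: string) { this.store.delete(key); }
}

const redis = new FakeRedis();
let dbCalls = 0;

// Hypothetical expensive recomputation we want to run only once per expiry.
async function fetchFromDb(): Promise<string> {
  dbCalls++;
  await new Promise((r) => setTimeout(r, 50)); // simulate a slow query
  return 'fresh-value';
}

// Cache-aside read guarded by a SETNX lock.
async function getWithLock(key: string): Promise<string> {
  const cached = await redis.get(key);
  if (cached !== null) return cached;

  const lockKey = `${key}:lock`;
  if (await redis.setnx(lockKey, '1')) {
    try {
      const value = await fetchFromDb();
      await redis.set(key, value); // fill cache before releasing the lock
      return value;
    } finally {
      await redis.del(lockKey);
    }
  }

  // Lost the lock: poll until the winner fills the cache.
  // (A real implementation would also add a timeout and a lock TTL.)
  while (true) {
    await new Promise((r) => setTimeout(r, 10));
    const value = await redis.get(key);
    if (value !== null) return value;
  }
}
```

With ten concurrent misses on the same key, `dbCalls` ends up at 1 instead of 10, which is exactly the thundering-herd behavior the mitigation targets.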
Alternatives Considered
1. Memcached
- Pros: simple, very fast, lower memory footprint than Redis
- Cons: no persistence, no data structures, no pub/sub
- Decision: ❌ Rejected - Redis offers more features at the same cost
2. Application-Level Cache (Node.js Map)
- Pros: zero latency, no external dependency
- Cons: not distributed, lost on app restart
- Decision: ❌ Rejected - does not scale to multiple instances
3. CDN Caching (Cloudflare/CloudFront)
- Pros: global distribution, DDoS protection
- Cons: static assets only, does not work for the API
- Decision: ⚠️ Complementary - use for the frontend, not for the API
4. Database Query Cache (PostgreSQL)
- Pros: already included in Postgres
- Cons: limited, not configurable per query
- Decision: ❌ Insufficient - we need more aggressive caching
5. GraphQL + DataLoader
- Pros: automatic batching, per-request cache
- Cons: requires migrating from REST to GraphQL
- Decision: ❌ Rejected - REST is enough for the MVP
6. No Cache (Optimize DB Queries)
- Pros: less complexity, no stale data
- Cons: impossible to meet the < 200ms target for ML predictions
- Decision: ❌ Rejected - performance requirements would be unattainable
Cache Warming Strategy
```typescript
// apps/backend/src/jobs/cache-warmer.ts
import cron from 'node-cron';

// Run every 5 minutes
cron.schedule('*/5 * * * *', async () => {
  const topSymbols = ['AAPL', 'TSLA', 'GOOGL', 'MSFT', 'AMZN'];
  for (const symbol of topSymbols) {
    // Pre-cache ML predictions for popular symbols
    await mlService.getPrediction(symbol, '1d');
    await mlService.getPrediction(symbol, '1w');
    // Pre-cache market data; today() is a date helper defined elsewhere
    await marketService.getOHLCV(symbol, today());
  }
  // Pre-cache leaderboards
  await leaderboardService.getMonthly();
});
```
Monitoring
Redis Metrics to Track
- Hit/Miss Ratio (target > 80%)
- Evicted Keys (should be 0)
- Memory Usage (alert at > 80%)
- Connected Clients
- Commands/sec
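The hit/miss ratio target above can be computed directly from the `keyspace_hits` and `keyspace_misses` counters that `INFO stats` returns. A sketch of a parser for that output (the `hitRatio` helper is hypothetical; the field names and the `field:value\r\n` line format are Redis's actual INFO format):

```typescript
// Compute the cache hit ratio from the text returned by Redis `INFO stats`.
function hitRatio(infoStats: string): number {
  const read = (field: string): number => {
    const match = infoStats.match(new RegExp(`^${field}:(\\d+)`, 'm'));
    return match ? Number(match[1]) : 0;
  };
  const hits = read('keyspace_hits');
  const misses = read('keyspace_misses');
  const total = hits + misses;
  return total === 0 ? 0 : hits / total;
}

// Example fragment in the same shape as real INFO stats output.
const sample = 'keyspace_hits:900\r\nkeyspace_misses:100\r\n';
console.log(hitRatio(sample)); // 0.9 → above the 80% target
```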
Logging
```typescript
// Log cache hits/misses
logger.info('cache_hit', { key, ttl: remaining });
logger.warn('cache_miss', { key, reason: 'expired' });
```
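Those log lines belong inside a cache-aside wrapper that knows whether a lookup hit or missed. A self-contained sketch of such a wrapper, with a `Map` standing in for Redis and `console`-backed logging standing in for the app's structured logger (both are illustrative stand-ins, not the real services):

```typescript
// Stand-in for the app's structured logger.
const logger = {
  info: (event: string, meta: object) => console.log(event, JSON.stringify(meta)),
  warn: (event: string, meta: object) => console.warn(event, JSON.stringify(meta)),
};

// In-memory store with expiry timestamps, standing in for Redis SETEX/GET.
const store = new Map<string, { value: string; expiresAt: number }>();

// Cache-aside read that logs every hit and miss.
async function getOrSet(
  key: string,
  ttlSeconds: number,
  loader: () => Promise<string>,
): Promise<string> {
  const entry = store.get(key);
  if (entry && entry.expiresAt > Date.now()) {
    logger.info('cache_hit', {
      key,
      ttl: Math.round((entry.expiresAt - Date.now()) / 1000),
    });
    return entry.value;
  }
  logger.warn('cache_miss', { key, reason: entry ? 'expired' : 'absent' });
  const value = await loader();
  store.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  return value;
}
```

Feeding the hit/miss events into the metrics pipeline gives the hit-ratio numbers the monitoring section targets, without a separate instrumentation pass.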