---
id: "ADR-006-caching"
title: "Caching Strategy"
type: "Documentation"
project: "trading-platform"
version: "1.0.0"
updated_date: "2026-01-04"
---
# ADR-006: Caching Strategy
**Status:** Accepted
**Date:** 2025-12-06
**Deciders:** Tech Lead, Architect
**Related:** ADR-001
---
## Context
Trading Platform needs to optimize performance in several areas:
1. **ML Predictions**: models take 200-500ms; users expect < 100ms
2. **Market Data**: trading APIs are limited to 100 requests/min
3. **User Sessions**: JWT validation on every request is costly
4. **Rate Limiting**: prevent API abuse (DDoS, brute force)
5. **Leaderboards**: complex queries that aggregate data across many users
6. **Historical Data**: historical data rarely changes
Performance requirements:
- API responses < 200ms (p95)
- ML predictions < 100ms (cached)
- Support for 10K concurrent users (post-MVP)
---
## Decision
### Cache Layer: Redis 7.x
**Redis** as the primary cache because of:
- In-memory storage: < 1ms latency
- Pub/Sub: distributed cache invalidation
- Data structures: Lists, Sets, Sorted Sets for leaderboards
- Native TTL: automatic expiration
- Optional persistence: RDB snapshots for disaster recovery
### Cache Strategy per Data Type
| Data Type | Strategy | TTL | Invalidation |
|-----------|----------|-----|--------------|
| **Sessions** | Write-through | 7 days | Logout, password change |
| **ML Predictions** | Cache-aside | 5 min | Model retrain |
| **Market Data** | Cache-aside | 1 min | Webhook from broker API |
| **User Profile** | Write-through | 1 hour | Profile update |
| **Leaderboards** | Cache-aside | 10 min | Cron job rebuild |
| **Historical OHLCV** | Cache-aside | 24 hours | Never (immutable) |
| **Rate Limit** | Counter | 1 min | Auto-expire |
### TTL Configuration
```typescript
// apps/backend/src/config/cache.ts
export const CACHE_TTL = {
  SESSION: 60 * 60 * 24 * 7,      // 7 days
  ML_PREDICTION: 60 * 5,          // 5 minutes
  MARKET_DATA: 60,                // 1 minute
  USER_PROFILE: 60 * 60,          // 1 hour
  LEADERBOARD: 60 * 10,           // 10 minutes
  OHLCV_HISTORICAL: 60 * 60 * 24, // 24 hours
  RATE_LIMIT: 60,                 // 1 minute
} as const;
```
### Redis Key Naming Convention
```
{app}:{entity}:{id}:{version}
Examples:
- session:user:123abc:v1
- ml:prediction:AAPL:1d:v2
- market:ohlcv:TSLA:2025-12-06:v1
- user:profile:456def:v1
- leaderboard:monthly:2025-12:v1
- ratelimit:api:192.168.1.1:v1
```
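The convention above can be centralized in a small helper so every service builds keys the same way. A minimal sketch; `cacheKey` and its parameter names are illustrative, not part of the codebase:

```typescript
// Builds "{app}:{entity}:{id}:{version}" keys per the naming convention.
type KeyParts = { app: string; entity: string; id: string; version?: number };

function cacheKey({ app, entity, id, version = 1 }: KeyParts): string {
  return `${app}:${entity}:${id}:v${version}`;
}

// Usage:
cacheKey({ app: "session", entity: "user", id: "123abc" });
// → "session:user:123abc:v1"
cacheKey({ app: "ml", entity: "prediction", id: "AAPL:1d", version: 2 });
// → "ml:prediction:AAPL:1d:v2"
```

Bumping `version` is also a cheap way to invalidate a whole family of keys after a schema change, since old entries simply expire via TTL.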
### Cache-Aside Pattern (Read-Heavy)
```typescript
// apps/backend/src/services/market.service.ts
async function getOHLCV(symbol: string, date: string) {
  const cacheKey = `market:ohlcv:${symbol}:${date}:v1`;

  // 1. Try cache first
  const cached = await redis.get(cacheKey);
  if (cached) {
    return JSON.parse(cached);
  }

  // 2. Cache miss → fetch from DB
  const data = await db.ohlcv.findUnique({
    where: { symbol, date }
  });

  // 3. Store in cache (skip DB misses so "null" is not cached)
  if (data) {
    await redis.setex(
      cacheKey,
      CACHE_TTL.OHLCV_HISTORICAL,
      JSON.stringify(data)
    );
  }
  return data;
}
```
### Write-Through Pattern (Write-Heavy)
```typescript
// apps/backend/src/services/user.service.ts
async function updateProfile(userId: string, data: ProfileUpdate) {
  const cacheKey = `user:profile:${userId}:v1`;

  // 1. Write to DB first
  const updated = await db.user.update({
    where: { id: userId },
    data
  });

  // 2. Update cache
  await redis.setex(
    cacheKey,
    CACHE_TTL.USER_PROFILE,
    JSON.stringify(updated)
  );
  return updated;
}
```
### Rate Limiting
```typescript
// apps/backend/src/middleware/rateLimit.ts
async function rateLimit(req: Request, res: Response, next: NextFunction) {
  const ip = req.ip;
  const key = `ratelimit:api:${ip}:v1`;

  const current = await redis.incr(key);
  if (current === 1) {
    // First hit in this window: start the 1-minute expiry
    await redis.expire(key, CACHE_TTL.RATE_LIMIT);
  }
  if (current > 100) { // 100 requests per minute
    return res.status(429).json({
      error: 'Too many requests, try again later'
    });
  }
  next();
}
```
### ML Predictions Cache
```python
# apps/ml-engine/cache.py
import json
from datetime import timedelta

import redis.asyncio as redis  # async client, so the awaits below don't block the event loop

redis_client = redis.Redis(host='localhost', port=6379)

async def get_prediction(symbol: str, timeframe: str):
    cache_key = f"ml:prediction:{symbol}:{timeframe}:v2"

    # Try cache
    cached = await redis_client.get(cache_key)
    if cached:
        return json.loads(cached)

    # Generate prediction (expensive)
    prediction = await model.predict(symbol, timeframe)

    # Cache for 5 minutes
    await redis_client.setex(
        cache_key,
        timedelta(minutes=5),
        json.dumps(prediction)
    )
    return prediction
```
### Cache Invalidation
```typescript
// apps/backend/src/services/cache.service.ts
class CacheInvalidator {
  // Invalidate single key
  async invalidate(key: string) {
    await redis.del(key);
  }

  // Invalidate pattern (use with caution: KEYS blocks Redis while it
  // scans the keyspace; prefer SCAN in production)
  async invalidatePattern(pattern: string) {
    const keys = await redis.keys(pattern);
    if (keys.length > 0) {
      await redis.del(...keys);
    }
  }

  // Invalidate on model retrain
  async invalidateMLPredictions() {
    await this.invalidatePattern('ml:prediction:*');
  }

  // Invalidate user session
  async invalidateSession(userId: string) {
    await this.invalidatePattern(`session:user:${userId}:*`);
  }
}
```
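With several backend instances, each node may also keep a small in-process cache in front of Redis; the Pub/Sub feature listed in the Decision section can broadcast invalidations so every instance drops its local copy. A minimal sketch, with an in-memory bus standing in for Redis (in production this would be ioredis `publish`/`subscribe` on a shared channel):

```typescript
type Handler = (message: string) => void;

// In-memory stand-in for Redis Pub/Sub, so the shape is runnable here.
class Bus {
  private handlers = new Map<string, Handler[]>();
  subscribe(channel: string, h: Handler) {
    this.handlers.set(channel, [...(this.handlers.get(channel) ?? []), h]);
  }
  publish(channel: string, message: string) {
    for (const h of this.handlers.get(channel) ?? []) h(message);
  }
}

const CHANNEL = "cache:invalidate";

// Each instance drops its local (L1) entry when another instance writes.
function attachInvalidation(bus: Bus, localCache: Map<string, unknown>) {
  bus.subscribe(CHANNEL, (key) => localCache.delete(key));
}

// The writer announces the key it just changed.
function announceWrite(bus: Bus, key: string) {
  bus.publish(CHANNEL, key);
}
```

Note that Redis Pub/Sub is fire-and-forget: an instance that is down during the publish misses the message, which is another reason to keep local TTLs short.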
---
## Consequences
### Positive
1. **Performance**: API responses < 100ms (cached) vs 500ms (uncached)
2. **Cost savings**: ~90% fewer requests to external market-data APIs
3. **Scalability**: Redis handles 100K ops/sec on modest hardware
4. **Rate limiting**: prevents abuse without affecting legitimate users
5. **ML efficiency**: cached predictions reduce GPU load
6. **Developer experience**: Redis CLI for cache debugging
7. **Session management**: instant logout via invalidation
### Negative
1. **Stale data**: the cache can serve outdated data for up to the TTL
2. **Memory cost**: Redis needs ~2GB RAM for 10K active users
3. **Complexity**: cache invalidation is "one of the two hard problems"
4. **Cold start**: the first requests are slow (cache warming needed)
5. **Thundering herd**: simultaneous requests when a cache entry expires
6. **Redis as a soft SPOF**: if Redis goes down, performance degrades (the app keeps working, just slower)
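The last negative above can be contained by treating the cache as optional per request. A sketch; `cachedOrDirect` is an illustrative helper, not from the codebase:

```typescript
// If the cache read throws (Redis down, timeout), fall through to the
// loader instead of failing the request.
async function cachedOrDirect<T>(
  get: () => Promise<string | null>,
  load: () => Promise<T>
): Promise<T> {
  try {
    const hit = await get();
    if (hit !== null) return JSON.parse(hit) as T;
  } catch {
    // Redis unreachable: log in real code, then continue without cache
  }
  return load();
}
```

Pairing this with a short client-side connect/command timeout keeps the degraded path slow-but-alive rather than hanging.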
### Risks and Mitigations
| Risk | Mitigation |
|------|-----------|
| Cache invalidation bugs | Unit + integration tests for the cache layer |
| Thundering herd | Cache locking (SETNX pattern) |
| Redis down | Graceful degradation: the app works without cache |
| Memory exhausted | Eviction policy: `allkeys-lru` |
| Stale ML predictions | Short TTL (5 min) + manual invalidation |
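The SETNX-style locking in the table can be sketched against a minimal client interface. Names are illustrative; with ioredis the acquire step would be `set(key, value, "EX", ttl, "NX")`, which returns null when the lock is already held:

```typescript
// Minimal stand-in for the one Redis capability this pattern needs.
interface Lockable {
  set(key: string, value: string, mode: "NX", ttlSec: number): Promise<boolean>;
  del(key: string): Promise<void>;
}

// Only the caller that wins the lock rebuilds the entry; everyone else
// takes the fallback (serve stale data, or retry after a short sleep).
async function withLock<T>(
  client: Lockable,
  key: string,
  ttlSec: number,
  rebuild: () => Promise<T>,
  fallback: () => Promise<T>
): Promise<T> {
  const acquired = await client.set(`lock:${key}`, "1", "NX", ttlSec);
  if (!acquired) return fallback();
  try {
    return await rebuild();
  } finally {
    await client.del(`lock:${key}`);
  }
}
```

The TTL on the lock itself matters: if the rebuilding process dies, the lock expires instead of blocking that key forever.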
---
## Alternatives Considered
### 1. Memcached
- **Pros**: simple, very fast, lower memory footprint than Redis
- **Cons**: no persistence, no data structures, no pub/sub
- **Decision**: rejected - Redis offers more features at the same cost
### 2. Application-Level Cache (Node.js Map)
- **Pros**: zero latency, no external dependency
- **Cons**: not distributed, lost on app restart
- **Decision**: rejected - does not scale to multiple instances
### 3. CDN Caching (Cloudflare/CloudFront)
- **Pros**: global distribution, DDoS protection
- **Cons**: only for static assets, does not work for the API
- **Decision**: complementary - use for the frontend, not for the API
### 4. Database Query Cache (PostgreSQL)
- **Pros**: already included in Postgres
- **Cons**: limited, not configurable per query
- **Decision**: insufficient - we need more aggressive caching
### 5. GraphQL + DataLoader
- **Pros**: automatic batching, per-request cache
- **Cons**: requires migrating from REST to GraphQL
- **Decision**: rejected - REST is sufficient for the MVP
### 6. No Cache (Optimize DB Queries)
- **Pros**: less complexity, no stale data
- **Cons**: impossible to reach < 200ms for ML predictions
- **Decision**: rejected - performance requirements cannot be met
---
## Cache Warming Strategy
```typescript
// apps/backend/src/jobs/cache-warmer.ts
import cron from 'node-cron';

// Run every 5 minutes
cron.schedule('*/5 * * * *', async () => {
  const topSymbols = ['AAPL', 'TSLA', 'GOOGL', 'MSFT', 'AMZN'];

  for (const symbol of topSymbols) {
    // Pre-cache ML predictions for popular symbols
    await mlService.getPrediction(symbol, '1d');
    await mlService.getPrediction(symbol, '1w');

    // Pre-cache market data
    await marketService.getOHLCV(symbol, today());
  }

  // Pre-cache leaderboards
  await leaderboardService.getMonthly();
});
```
---
## Monitoring
### Redis Metrics to Track
```
- Hit/Miss Ratio (target > 80%)
- Evicted Keys (should be 0)
- Memory Usage (alert at > 80%)
- Connected Clients
- Commands/sec
```
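The hit/miss ratio above can be computed from the `INFO stats` output (`redis-cli INFO stats` on the command line, or `client.info('stats')` with ioredis/node-redis, both of which return the same line-oriented text):

```typescript
// Parses "keyspace_hits:<n>" / "keyspace_misses:<n>" out of INFO stats
// and returns the hit ratio in [0, 1].
function hitRatio(infoStats: string): number {
  const num = (field: string) =>
    Number(infoStats.match(new RegExp(`${field}:(\\d+)`))?.[1] ?? 0);
  const hits = num("keyspace_hits");
  const misses = num("keyspace_misses");
  return hits + misses === 0 ? 0 : hits / (hits + misses);
}

// e.g. hitRatio("keyspace_hits:900\r\nkeyspace_misses:100") → 0.9
```

These counters are cumulative since server start, so for alerting it is usually better to diff two samples than to read the lifetime ratio.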
### Logging
```typescript
// Log cache hits/misses
logger.info('cache_hit', { key, ttl: remaining });
logger.warn('cache_miss', { key, reason: 'expired' });
```
---
## References
- [Redis Documentation](https://redis.io/docs/)
- [Cache Strategies](https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/Strategies.html)
- [Cache Invalidation Patterns](https://martinfowler.com/bliki/TwoHardThings.html)
- [Redis Best Practices](https://redis.io/docs/manual/patterns/)