# SAAS-006: AI Integration

## Metadata

- **Code:** SAAS-006
- **Module:** AI Integration
- **Priority:** P1
- **Status:** Pending
- **Phase:** 5 - Integrations
## Description

Provider-agnostic wrapper for integrating multiple LLM providers (Claude, OpenAI, Gemini): per-tenant configuration, usage tracking, rate limiting, and cost estimation.
## Objectives

1. Multi-provider abstraction
2. Per-tenant configuration
3. Token tracking
4. Rate limiting
5. Cost estimation
## Scope

### Included

- Support for Claude (Anthropic), GPT-4 (OpenAI), and Gemini (Google)
- OpenRouter as the gateway
- Per-tenant model configuration
- Input/output token counting
- Per-minute and per-month rate limiting
- Usage logs

### Excluded

- Embeddings (later phase)
- Fine-tuning
- Vision/images (later phase)
- Streaming (later phase)
## Supported Providers

| Provider | Models | Via |
|----------|--------|-----|
| Anthropic | claude-3-opus, claude-3-sonnet, claude-3-haiku | OpenRouter |
| OpenAI | gpt-4-turbo, gpt-4, gpt-3.5-turbo | OpenRouter |
| Google | gemini-pro, gemini-1.5-pro | OpenRouter |
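Since every call goes through OpenRouter, the short names in the table map to OpenRouter's `vendor/model` slug form — the same form the Configuration section's `defaultModel` uses. A minimal lookup sketch; the slug strings are assumptions to verify against OpenRouter's published model list:

```typescript
// Map the table's short model names to OpenRouter `vendor/model` slugs.
// Slug strings are assumed — confirm against OpenRouter's model catalog.
const MODEL_SLUGS: Record<string, string> = {
  'claude-3-opus': 'anthropic/claude-3-opus',
  'claude-3-sonnet': 'anthropic/claude-3-sonnet',
  'claude-3-haiku': 'anthropic/claude-3-haiku',
  'gpt-4-turbo': 'openai/gpt-4-turbo',
  'gpt-4': 'openai/gpt-4',
  'gpt-3.5-turbo': 'openai/gpt-3.5-turbo',
  'gemini-pro': 'google/gemini-pro',
  'gemini-1.5-pro': 'google/gemini-1.5-pro',
};

function toOpenRouterSlug(model: string): string {
  // Pass through values that already look like full slugs.
  return MODEL_SLUGS[model] ?? model;
}
```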
## Architecture

```
┌─────────────────────────────────────────────────────────┐
│                       AI Service                        │
├─────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐      │
│  │   Config    │  │   Router    │  │   Tracker   │      │
│  │ per Tenant  │  │   (Model)   │  │   (Usage)   │      │
│  └─────────────┘  └──────┬──────┘  └─────────────┘      │
│                          │                              │
│                   ┌──────▼──────┐                       │
│                   │ OpenRouter  │                       │
│                   │   Client    │                       │
│                   └──────┬──────┘                       │
│                          │                              │
└──────────────────────────┼──────────────────────────────┘
                           │
         ┌─────────────────┼─────────────────┐
         │                 │                 │
    ┌────▼────┐      ┌─────▼─────┐      ┌────▼────┐
    │ Claude  │      │   GPT-4   │      │ Gemini  │
    └─────────┘      └───────────┘      └─────────┘
```
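The OpenRouter Client box above can be sketched as a request builder: OpenRouter exposes an OpenAI-compatible chat completions endpoint, so the payload follows that shape. The defaults and the minimal `Message` type here are illustrative assumptions, not the final contract:

```typescript
// Sketch of the OpenRouter Client: build the HTTP request for a chat call.
// Endpoint and payload follow OpenRouter's OpenAI-compatible API.
interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

function buildChatRequest(
  apiKey: string,
  model: string,
  messages: Message[],
  options: { temperature?: number; maxTokens?: number } = {},
) {
  return {
    url: 'https://openrouter.ai/api/v1/chat/completions',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model, // e.g. 'anthropic/claude-3-haiku'
      messages,
      temperature: options.temperature ?? 0.7, // assumed defaults
      max_tokens: options.maxTokens ?? 1024,
    }),
  };
}
```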
## Data Model

### Tables (schema: ai)

**ai_configs**
- id, tenant_id, default_model
- system_prompt, temperature
- max_tokens, settings

**ai_usage**
- id, tenant_id, user_id
- model, input_tokens, output_tokens
- cost_usd, latency_ms
- created_at
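For illustration, the `ai_usage` columns rendered as a TypeScript row type, plus the kind of aggregation that would back a usage endpoint. Field names are camelCase renderings of the columns above; the helper is a sketch, not the final tracker:

```typescript
// One ai.ai_usage row (columns from the table above, camelCased).
interface AiUsageRow {
  id: string;
  tenantId: string;
  userId: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
  latencyMs: number;
  createdAt: Date;
}

// Aggregate a period's rows into the totals a usage report needs.
function summarizeUsage(rows: AiUsageRow[]) {
  return rows.reduce(
    (acc, r) => ({
      totalTokens: acc.totalTokens + r.inputTokens + r.outputTokens,
      totalCostUsd: acc.totalCostUsd + r.costUsd,
      requests: acc.requests + 1,
    }),
    { totalTokens: 0, totalCostUsd: 0, requests: 0 },
  );
}
```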
## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | /ai/chat | Send a message |
| POST | /ai/complete | Complete text |
| GET | /ai/models | Available models |
| GET | /ai/config | Tenant config |
| PUT | /ai/config | Update config |
| GET | /ai/usage | Current-period usage |
| GET | /ai/usage/history | Usage history |
## Service Interface

```typescript
interface AIService {
  chat(messages: Message[], options?: AIOptions): Promise<AIResponse>;
  complete(prompt: string, options?: AIOptions): Promise<AIResponse>;
  countTokens(text: string): number;
  estimateCost(inputTokens: number, outputTokens: number, model: string): number;
}

interface AIOptions {
  model?: string;
  temperature?: number;
  maxTokens?: number;
  systemPrompt?: string;
}

interface AIResponse {
  content: string;
  model: string;
  usage: {
    inputTokens: number;
    outputTokens: number;
    totalTokens: number;
  };
  cost: number;
  latencyMs: number;
}
```
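`countTokens` is tokenizer-specific, so a synchronous implementation can only approximate. As a placeholder, a common rough heuristic (~4 characters per token for English text) works until the provider-reported usage numbers arrive — an assumption-laden approximation, not a real tokenizer:

```typescript
// Rough stand-in for countTokens from the interface above.
// ~4 characters per token is a coarse heuristic for English text;
// prefer the usage counts the provider returns with each response.
function countTokens(text: string): number {
  return Math.ceil(text.length / 4);
}
```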
## Rate Limiting

### By Plan

| Plan | Tokens/min | Tokens/month |
|------|------------|--------------|
| Pro | 10,000 | 50,000 |
| Enterprise | 50,000 | 200,000 |

### Implementation

```typescript
// Redis-based rate limiter: fixed one-minute window of tokens per tenant
const key = `ai:ratelimit:${tenantId}`;
const current = await redis.incrby(key, tokens);
if (current === tokens) {
  await redis.expire(key, 60); // first hit in the window: start the TTL
}
if (current > limit) {
  throw new RateLimitException('AI token limit exceeded');
}
```
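For unit tests or a single-instance deployment, the same fixed-window idea can be kept in memory. A sketch under those assumptions — the production path stays in Redis so limits hold across instances:

```typescript
// In-memory analogue of the Redis counter: one fixed window per tenant.
class FixedWindowLimiter {
  private windows = new Map<string, { count: number; resetAt: number }>();

  constructor(private limit: number, private windowMs = 60_000) {}

  /** Returns true if the spend fits in the current window, false otherwise. */
  take(tenantId: string, tokens: number, now = Date.now()): boolean {
    const w = this.windows.get(tenantId);
    if (!w || now >= w.resetAt) {
      // New window: start counting and schedule the reset.
      this.windows.set(tenantId, { count: tokens, resetAt: now + this.windowMs });
      return tokens <= this.limit;
    }
    w.count += tokens;
    return w.count <= this.limit;
  }
}
```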
## Estimated Costs

| Model | Input/1K | Output/1K |
|-------|----------|-----------|
| claude-3-haiku | $0.00025 | $0.00125 |
| claude-3-sonnet | $0.003 | $0.015 |
| gpt-4-turbo | $0.01 | $0.03 |
| gemini-pro | $0.00025 | $0.0005 |
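`estimateCost` from the service interface can be driven directly off this price table. A sketch using the rates above (verify against current provider pricing before relying on the output):

```typescript
// Per-1K-token prices from the Estimated Costs table above.
const PRICES_PER_1K: Record<string, { input: number; output: number }> = {
  'claude-3-haiku': { input: 0.00025, output: 0.00125 },
  'claude-3-sonnet': { input: 0.003, output: 0.015 },
  'gpt-4-turbo': { input: 0.01, output: 0.03 },
  'gemini-pro': { input: 0.00025, output: 0.0005 },
};

function estimateCost(inputTokens: number, outputTokens: number, model: string): number {
  const p = PRICES_PER_1K[model];
  if (!p) throw new Error(`Unknown model: ${model}`);
  return (inputTokens / 1000) * p.input + (outputTokens / 1000) * p.output;
}
```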
## Deliverables

| Deliverable | Status | File |
|-------------|--------|------|
| ai.module.ts | Pending | `modules/ai/` |
| openrouter.client.ts | Pending | `clients/` |
| ai.service.ts | Pending | `services/` |
| usage.tracker.ts | Pending | `services/` |
## Dependencies

### Depends on

- SAAS-002 (Tenants)
- SAAS-005 (Plans - feature flag)
- OpenRouter API key

### Blocks

- AI features in other modules
## Acceptance Criteria

- [ ] Chat works with Claude
- [ ] Chat works with GPT-4
- [ ] Chat works with Gemini
- [ ] Tokens are counted correctly
- [ ] Rate limiting works
- [ ] Usage is logged
## Configuration

```typescript
{
  ai: {
    provider: 'openrouter',
    apiKey: process.env.OPENROUTER_API_KEY,
    defaultModel: 'anthropic/claude-3-haiku',
    fallbackModel: 'openai/gpt-3.5-turbo',
    timeout: 30000
  }
}
```
---

**Last updated:** 2026-01-07