---
id: INT-003
type: Integration
title: "OpenRouter (LLM Gateway)"
provider: "OpenRouter"
status: Activo
integration_type: "llm"
created_at: 2026-01-04
updated_at: 2026-01-10
simco_version: "3.8.0"
tags:
  - openrouter
  - llm
  - ai
  - multi-tenant
---

# INT-003: OpenRouter (LLM Gateway)

## Metadata

| Field | Value |
|-------|-------|
| **Code** | INT-003 |
| **Provider** | OpenRouter |
| **Type** | LLM / AI |
| **Status** | Implemented |
| **Multi-tenant** | Yes |
| **Integration date** | 2026-01-06 |
| **Last update** | 2026-01-10 |
| **Owner** | Backend Team |

---

## 1. Description

OpenRouter is an LLM gateway that exposes multiple AI models (Claude, GPT-4, Mistral, etc.) through a single API. MiChangarrito uses it to power the AI assistant that answers store owners' questions over WhatsApp.

**Main use cases:**
- Answer questions about sales, inventory, and profits
- Interpret transcribed voice commands
- Generate context-aware responses
- Process natural language for operations

---

## 2. Required Credentials

### Environment Variables

| Variable | Description | Type | Required |
|----------|-------------|------|----------|
| `OPENROUTER_API_KEY` | OpenRouter API key | string | YES |
| `LLM_MODEL_DEFAULT` | Default model | string | YES |
| `LLM_MODEL_FALLBACK` | Fallback model | string | NO |
| `LLM_MAX_TOKENS` | Maximum response tokens | number | NO |

### Example .env

```env
# OpenRouter / LLM
OPENROUTER_API_KEY=sk-or-v1-xxxxxxxxxxxx
LLM_MODEL_DEFAULT=anthropic/claude-3-haiku
LLM_MODEL_FALLBACK=openai/gpt-3.5-turbo
LLM_MAX_TOKENS=1000
```

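Since two of these variables are required and two are optional, it can help to validate them once at startup. The following is a minimal sketch, not the project's actual loader: `LlmConfig`, `loadLlmConfig`, and the default of 1000 for `LLM_MAX_TOKENS` (borrowed from the example values above) are assumptions made for illustration.

```typescript
// Hypothetical shape for the resolved settings; not part of the original module.
interface LlmConfig {
  apiKey: string;
  modelDefault: string;
  modelFallback: string | undefined;
  maxTokens: number;
}

// Assumed default, taken from the example .env values.
const DEFAULT_MAX_TOKENS = 1000;

// Validates the variables documented in the table and returns typed settings.
function loadLlmConfig(env: Record<string, string | undefined>): LlmConfig {
  const apiKey = env['OPENROUTER_API_KEY'];
  const modelDefault = env['LLM_MODEL_DEFAULT'];
  if (!apiKey) throw new Error('OPENROUTER_API_KEY is required');
  if (!modelDefault) throw new Error('LLM_MODEL_DEFAULT is required');

  // LLM_MAX_TOKENS is optional: use the default when it is unset or empty.
  const rawMaxTokens = env['LLM_MAX_TOKENS'];
  const maxTokens =
    rawMaxTokens === undefined || rawMaxTokens === ''
      ? DEFAULT_MAX_TOKENS
      : Number(rawMaxTokens);
  if (!Number.isInteger(maxTokens) || maxTokens <= 0) {
    throw new Error(`LLM_MAX_TOKENS must be a positive integer, got "${rawMaxTokens}"`);
  }

  return {
    apiKey,
    modelDefault,
    modelFallback: env['LLM_MODEL_FALLBACK'] || undefined,
    maxTokens,
  };
}
```

In application code this would typically be called once as `loadLlmConfig(process.env)`, so that a missing key fails at boot rather than on the first incoming message.
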
### Obtaining Credentials

1. Create an account at [OpenRouter](https://openrouter.ai/)
2. Add credits (pay-per-use)
3. Generate an API key in the dashboard
4. Select the default model

---

## 3. Endpoints/SDK Used

### Implemented Operations

| Operation | Method | Endpoint | Description |
|-----------|--------|----------|-------------|
| Chat Completion | POST | `/api/v1/chat/completions` | Generate a response |
| List Models | GET | `/api/v1/models` | List available models |

### Supported Models

| Model | ID | Approx. cost | Use |
|-------|----|--------------|-----|
| Claude 3 Haiku | `anthropic/claude-3-haiku` | $0.25/1M | Default (fast, cheap) |
| Claude 3 Sonnet | `anthropic/claude-3-sonnet` | $3/1M | Premium |
| GPT-3.5 Turbo | `openai/gpt-3.5-turbo` | $0.50/1M | Fallback |
| Mistral 7B | `mistralai/mistral-7b-instruct` | $0.06/1M | Low-cost |

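To make the cost column concrete, the sketch below turns the approximate per-million rates into a per-request estimate. It assumes a single flat rate per model, exactly as listed in the table; actual provider pricing may distinguish prompt and completion tokens. `APPROX_USD_PER_MILLION_TOKENS` and `estimateCostUsd` are hypothetical names.

```typescript
// Approximate USD price per 1M tokens, copied from the supported-models table.
// Assumption: one flat rate per model.
const APPROX_USD_PER_MILLION_TOKENS: Record<string, number> = {
  'anthropic/claude-3-haiku': 0.25,
  'anthropic/claude-3-sonnet': 3,
  'openai/gpt-3.5-turbo': 0.5,
  'mistralai/mistral-7b-instruct': 0.06,
};

// Hypothetical helper: estimated cost in USD for a given number of tokens.
function estimateCostUsd(modelId: string, totalTokens: number): number {
  const rate = APPROX_USD_PER_MILLION_TOKENS[modelId];
  if (rate === undefined) {
    throw new Error(`No price configured for model "${modelId}"`);
  }
  return (totalTokens / 1_000_000) * rate;
}
```

For example, `estimateCostUsd('anthropic/claude-3-sonnet', 500_000)` evaluates to `1.5` under these rates.
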
### SDK/Client Used

```typescript
// OpenAI-compatible HTTP client pointed at OpenRouter
import OpenAI from 'openai';

const openrouter = new OpenAI({
  apiKey: process.env.OPENROUTER_API_KEY,
  baseURL: 'https://openrouter.ai/api/v1',
});

const response = await openrouter.chat.completions.create({
  // LLM_MODEL_DEFAULT is a required variable (see section 2)
  model: process.env.LLM_MODEL_DEFAULT!,
  messages: [
    { role: 'system', content: systemPrompt },
    { role: 'user', content: userMessage },
  ],
  // LLM_MAX_TOKENS is optional; default to 1000 (the example .env value) instead of passing NaN
  max_tokens: Number.parseInt(process.env.LLM_MAX_TOKENS ?? '1000', 10),
});
```

---

## 4. Rate Limits

| Limit | Value | Period | Action when exceeded |
|-------|-------|--------|----------------------|
| Requests | Depends on the model | per minute | 429 + retry |
| Credits | Pay-per-use | N/A | Add credits |

### Retry and Fallback Strategy

```typescript
async generateResponse(messages: Message[]): Promise<string> {
  try {
    // Try the default model first
    return await this.callLLM(process.env.LLM_MODEL_DEFAULT, messages);
  } catch (error) {
    // LLM_MODEL_FALLBACK is optional, so only switch models when it is configured
    const fallbackModel = process.env.LLM_MODEL_FALLBACK;
    if (fallbackModel && this.isRateLimitError(error)) {
      return await this.callLLM(fallbackModel, messages);
    }
    throw error;
  }
}
```

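The rate-limit table calls for a retry on 429, while the method above switches models immediately. One way to add the retry step is exponential backoff before giving up on a model. This is a sketch under stated assumptions: the attempt count and delays are not specified in the source, and `backoffDelayMs`, `sleep`, and `withBackoff` are hypothetical helpers. A caller could wrap the default-model call as `withBackoff(() => callDefaultModel(), isRateLimitError)` and fall back only after the retries are exhausted.

```typescript
// Hypothetical retry settings; the source does not specify attempt counts or delays.
const MAX_ATTEMPTS = 3;
const BASE_DELAY_MS = 500;
const MAX_DELAY_MS = 8000;

// Delay before retry number `attempt` (1-based): 500 ms, 1 s, 2 s, ... capped at 8 s.
function backoffDelayMs(attempt: number): number {
  return Math.min(BASE_DELAY_MS * 2 ** (attempt - 1), MAX_DELAY_MS);
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Runs `operation`, retrying with exponential backoff while `shouldRetry(error)` is true.
// Rethrows the last error once MAX_ATTEMPTS is reached or the error is not retryable.
async function withBackoff<T>(
  operation: () => Promise<T>,
  shouldRetry: (error: unknown) => boolean,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (attempt >= MAX_ATTEMPTS || !shouldRetry(error)) throw error;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```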
---

## 5. Error Handling

### Error Codes

| Code | Description | Recommended action | Retry |
|------|-------------|--------------------|-------|
| 400 | Invalid message | Validate input | NO |
| 401 | Invalid API key | Verify the key | NO |
| 402 | Out of credits | Add credits | NO |
| 429 | Rate limited | Backoff + fallback | YES |
| 500 | Model error | Retry with another model | YES |

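The Retry column can be encoded as a small lookup so that callers do not repeat status checks. This is a sketch only: `RETRYABLE_BY_STATUS` and `isRetryableStatus` are hypothetical names, and treating statuses that are absent from the table as non-retryable is an assumption.

```typescript
// Retry policy per HTTP status, transcribed from the error-code table.
const RETRYABLE_BY_STATUS: Record<number, boolean> = {
  400: false, // invalid message
  401: false, // invalid API key
  402: false, // out of credits
  429: true,  // rate limited
  500: true,  // model error
};

// Hypothetical helper. Assumption: statuses missing from the table are not retried.
function isRetryableStatus(status: number): boolean {
  return RETRYABLE_BY_STATUS[status] ?? false;
}
```

A conservative default keeps an unexpected status from triggering repeated billable calls.
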
### Handling Example

```typescript
try {
  const response = await this.llmService.generate(prompt);
  return response;
} catch (error) {
  this.logger.error('LLM error', {
    service: 'openrouter',
    model: this.model,
    // `error` is `unknown` under strict TypeScript, so narrow it before reading `.message`
    error: error instanceof Error ? error.message : String(error),
    tenantId: context.tenantId,
  });

  // Fall back to a predefined response
  return this.getFallbackResponse(intent);
}
```

---

## 6. Fallbacks

### Multi-level Fallback Strategy

| Level | Strategy | When |
|-------|----------|------|
| 1 | Fallback model (GPT-3.5) | Rate limit or error on the default model |
| 2 | Predefined responses | Out of credits or API down |
| 3 | Friendly error message | Total failure |

### Predefined Responses

```typescript
// User-facing messages (kept in Spanish), keyed by intent
const fallbackResponses = {
  ventas: 'Lo siento, no puedo consultar tus ventas en este momento. Intenta en unos minutos.',
  inventario: 'El servicio de consulta esta temporalmente no disponible.',
  default: 'Disculpa, no pude procesar tu consulta. Intenta de nuevo o escribe "ayuda".'
};
```

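The error-handling example in section 5 calls `getFallbackResponse(intent)` without showing it. A possible lookup is sketched below; the messages are repeated so the block is self-contained, and the function body is an assumption rather than the module's actual code. A `Map` is used so that intent names such as `constructor` cannot accidentally match an inherited object property.

```typescript
// Message used when the intent is unknown or missing (same text as the `default` entry).
const DEFAULT_RESPONSE =
  'Disculpa, no pude procesar tu consulta. Intenta de nuevo o escribe "ayuda".';

// Intent-specific messages, same text as the predefined responses.
const responsesByIntent = new Map<string, string>([
  ['ventas', 'Lo siento, no puedo consultar tus ventas en este momento. Intenta en unos minutos.'],
  ['inventario', 'El servicio de consulta esta temporalmente no disponible.'],
]);

// Hypothetical implementation: returns the message for `intent`, or the default one.
function getFallbackResponse(intent?: string): string {
  if (intent === undefined) return DEFAULT_RESPONSE;
  return responsesByIntent.get(intent) ?? DEFAULT_RESPONSE;
}
```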
---

## 7. Multi-tenant

### Credential Model

- [x] **Per tenant:** Each tenant can configure its own API key
- [x] **Global with fallback:** If a tenant has no key, the global key is used

### Storage

```sql
-- In the messaging schema
CREATE TABLE messaging.tenant_llm_config (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id UUID REFERENCES auth.tenants(id),
    provider VARCHAR(50) DEFAULT 'openrouter',
    api_key TEXT, -- Encrypted
    model_default VARCHAR(100),
    max_tokens INTEGER DEFAULT 1000,
    is_active BOOLEAN DEFAULT true,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    UNIQUE(tenant_id)
);
```

### Context in Calls

```typescript
async generateForTenant(tenantId: UUID, prompt: string): Promise<string> {
  const config = await this.getLLMConfig(tenantId);

  // Use the tenant's config, falling back to the global settings
  const apiKey = config?.api_key || process.env.OPENROUTER_API_KEY;
  const model = config?.model_default || process.env.LLM_MODEL_DEFAULT;

  return this.generate(apiKey, model, prompt);
}
```

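The method above reads only `api_key` and `model_default`, while the table also stores `max_tokens` and `is_active`. The sketch below shows one way to resolve all of them in a single pure function. It rests on two assumptions that the source does not confirm: an inactive row is treated as if the tenant had no configuration, and `api_key` has already been decrypted by the caller. All type and function names are hypothetical.

```typescript
// Row shape mirroring messaging.tenant_llm_config (only the columns used here).
interface TenantLlmConfigRow {
  api_key: string | null; // assumed to be already decrypted by the caller
  model_default: string | null;
  max_tokens: number | null;
  is_active: boolean;
}

// Global settings, e.g. loaded from the environment variables in section 2.
interface GlobalLlmSettings {
  apiKey: string;
  model: string;
  maxTokens: number;
}

interface EffectiveLlmSettings extends GlobalLlmSettings {
  usesTenantKey: boolean; // true when the request is billed to the tenant's own key
}

function resolveLlmSettings(
  row: TenantLlmConfigRow | null,
  globals: GlobalLlmSettings,
): EffectiveLlmSettings {
  // Assumption: an inactive row is ignored entirely.
  const active = row !== null && row.is_active ? row : null;
  return {
    apiKey: active?.api_key || globals.apiKey,
    model: active?.model_default || globals.model,
    maxTokens: active?.max_tokens ?? globals.maxTokens,
    usesTenantKey: Boolean(active?.api_key),
  };
}
```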
---

## 8. Token Consumption

### Token Tracking

```sql
CREATE TABLE subscriptions.token_usage (
    id UUID PRIMARY KEY,
    tenant_id UUID REFERENCES auth.tenants(id),
    tokens_used INTEGER NOT NULL,
    model VARCHAR(100),
    operation VARCHAR(50),
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
```

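As an illustration of how a completion could be mapped onto this table, the sketch below builds a row from the `usage` object that OpenAI-compatible chat completion responses include. It assumes `tokens_used` is the sum of prompt and completion tokens, which the source does not state, and it leaves `id` and `created_at` to be set at insert time. `TokenUsageRow`, `toTokenUsageRow`, and `totalTokensUsed` are hypothetical names.

```typescript
// Subset of the `usage` object in OpenAI-compatible chat completion responses.
interface CompletionUsage {
  prompt_tokens: number;
  completion_tokens: number;
}

// Columns of subscriptions.token_usage that the application fills in.
// `id` and `created_at` are omitted: they are assumed to be set at insert time.
interface TokenUsageRow {
  tenant_id: string;
  tokens_used: number;
  model: string;
  operation: string;
}

// Assumption: tokens_used counts prompt and completion tokens together.
function toTokenUsageRow(
  tenantId: string,
  model: string,
  operation: string,
  usage: CompletionUsage,
): TokenUsageRow {
  return {
    tenant_id: tenantId,
    tokens_used: usage.prompt_tokens + usage.completion_tokens,
    model,
    operation,
  };
}

// Total tokens across a set of rows, e.g. one tenant's rows for a billing period.
function totalTokensUsed(rows: TokenUsageRow[]): number {
  return rows.reduce((sum, row) => sum + row.tokens_used, 0);
}
```

The `operation` value (for example `'chat'`) is free-form in this sketch; the source does not list the allowed values.
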
### Token Prices (MiChangarrito)

| Package | Price | Tokens |
|---------|-------|--------|
| Basic | $29 MXN | 1,000 |
| Standard | $69 MXN | 3,000 |
| Pro | $149 MXN | 8,000 |
| Enterprise | $299 MXN | 20,000 |

---

## 9. Testing

### Test Mode

| Environment | Configuration |
|-------------|---------------|
| Development | Low-cost model (Mistral 7B) |
| Staging | Default model (Claude Haiku) |
| Production | Default model with fallback |

### Test Commands

```bash
# Direct test against OpenRouter
curl -X POST https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-3-haiku",
    "messages": [{"role": "user", "content": "Hola, que hora es?"}]
  }'

# Test through the MCP Server
curl -X POST http://localhost:3142/chat \
  -H "Content-Type: application/json" \
  -d '{"tenantId": "xxx", "message": "Cuanto vendi hoy?"}'
```

---

## 10. Monitoring

### Metrics to Monitor

| Metric | Description | Alert |
|--------|-------------|-------|
| Latency | Response time | > 5s |
| Tokens/request | Average tokens | > 500 |
| Error rate | % of errors | > 5% |
| Daily cost | Spend in USD | > $10 |

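The alert thresholds can be expressed as data and checked in one place. The following is a sketch, assuming the metrics have already been aggregated elsewhere; `LlmMetrics`, `ALERT_THRESHOLDS`, and `findAlerts` are hypothetical names, and the comparison is strictly "greater than", matching the `>` signs in the table.

```typescript
// Aggregated metrics for some time window (field names are illustrative).
interface LlmMetrics {
  latencyMs: number;
  avgTokensPerRequest: number;
  errorRatePct: number;
  dailyCostUsd: number;
}

// Thresholds transcribed from the metrics table.
const ALERT_THRESHOLDS: LlmMetrics = {
  latencyMs: 5000,          // > 5s
  avgTokensPerRequest: 500, // > 500
  errorRatePct: 5,          // > 5%
  dailyCostUsd: 10,         // > $10
};

// Returns the names of the metrics that exceed their threshold.
function findAlerts(metrics: LlmMetrics): Array<keyof LlmMetrics> {
  const keys = Object.keys(ALERT_THRESHOLDS) as Array<keyof LlmMetrics>;
  return keys.filter((key) => metrics[key] > ALERT_THRESHOLDS[key]);
}
```

With a latency of 6200 ms and a daily cost of 12.5 USD, and the other two metrics at or below their limits, `findAlerts` returns `['latencyMs', 'dailyCostUsd']`.
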
### Structured Logs

```typescript
this.logger.info('LLM request', {
  service: 'openrouter',
  model: model,
  tenantId: context.tenantId,
  promptTokens: usage.prompt_tokens,
  completionTokens: usage.completion_tokens,
  duration: durationMs,
});
```

---

## 11. References

### Official Documentation
- [OpenRouter API Reference](https://openrouter.ai/docs)
- [Supported Models](https://openrouter.ai/models)

### Related Modules
- [MCP Server](../../apps/mcp-server/)
- [LLM Module](../../apps/backend/src/modules/llm/)
- [Multi-Tenant Architecture](../90-transversal/ARQUITECTURA-MULTI-TENANT-INTEGRACIONES.md)

---

**Last updated:** 2026-01-10
**Author:** Backend Team