smart-ai-cache
A lightweight, intelligent caching middleware for AI responses, designed to reduce API costs and improve response times for repetitive LLM queries.
smart-ai-cache is an NPM package targeting Node.js developers building AI-powered applications. Its primary goal is to reduce API costs and improve response times for repetitive Large Language Model (LLM) queries.
To install smart-ai-cache in your project, use npm or yarn:
npm install smart-ai-cache
# or
yarn add smart-ai-cache
Here's how to quickly get started with smart-ai-cache:
import { AIResponseCache } from 'smart-ai-cache';
import OpenAI from 'openai';
// Initialize the cache with a default TTL of 1 hour (3600 seconds)
const cache = new AIResponseCache({ ttl: 3600 });
// Initialize your OpenAI client (or any other AI provider client)
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
async function getCachedCompletion() {
  const response = await cache.wrap(
    () => openai.chat.completions.create({
      model: 'gpt-4',
      messages: [{ role: 'user', content: 'Hello, world!' }],
    }),
    { provider: 'openai', model: 'gpt-4' }
  );
  console.log('Response:', response.choices[0].message.content);

  // Subsequent calls with the same prompt will be served from cache
  const cachedResponse = await cache.wrap(
    () => openai.chat.completions.create({
      model: 'gpt-4',
      messages: [{ role: 'user', content: 'Hello, world!' }],
    }),
    { provider: 'openai', model: 'gpt-4' }
  );
  console.log('Cached Response:', cachedResponse.choices[0].message.content);

  // Check cache statistics
  const stats = cache.getStats();
  console.log(`Cache Hits: ${stats.cacheHits}, Cache Misses: ${stats.cacheMisses}`);
}

getCachedCompletion();
The AIResponseCache constructor accepts an optional CacheConfig object:
interface CacheConfig {
  ttl?: number;                 // Time to live in seconds (default: 3600)
  maxSize?: number;             // Maximum cache entries (default: 1000)
  storage?: 'memory' | 'redis'; // Storage backend (default: 'memory')
  redisOptions?: RedisOptions;  // Redis connection options (requires 'ioredis')
  keyPrefix?: string;           // Cache key prefix (default: 'ai-cache:')
  enableStats?: boolean;        // Enable statistics tracking (default: true)
  debug?: boolean;              // Enable debug logging (default: false)
}
Example with custom configuration:
import { AIResponseCache } from 'smart-ai-cache';
const customCache = new AIResponseCache({
  ttl: 7200,       // Cache entries expire after 2 hours
  maxSize: 5000,   // Allow up to 5000 entries in memory
  enableStats: true,
  debug: false,
});
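To share cached responses across processes or have them survive restarts, the storage: 'redis' backend can be combined with redisOptions. Below is a minimal sketch, assuming the ioredis package is installed and that redisOptions follows ioredis's connection options; the host, port, and key prefix are placeholder values:

import { AIResponseCache } from 'smart-ai-cache';

// Sketch of a Redis-backed cache (assumes the 'ioredis' package is installed;
// host, port, and keyPrefix below are placeholders for your deployment).
const sharedCache = new AIResponseCache({
  ttl: 3600,
  storage: 'redis',
  keyPrefix: 'ai-cache:prod:',
  redisOptions: {
    host: '127.0.0.1',
    port: 6379,
  },
});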
For a complete and detailed API reference, including all classes, methods, and interfaces, please refer to the TypeDoc documentation.
smart-ai-cache provides specialized classes for popular AI providers to simplify integration and enable automatic cost tracking.
import { OpenAICache } from 'smart-ai-cache';
import OpenAI from 'openai';
const openaiClient = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const openaiCache = new OpenAICache({
  ttl: 7200,
  maxSize: 5000,
  enableStats: true,
}, openaiClient); // Pass the initialized OpenAI client

async function getOpenAIChatCompletion() {
  const response = await openaiCache.chatCompletion({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Explain quantum computing' }],
    temperature: 0.7,
  });
  console.log('OpenAI Response:', response.choices[0].message.content);

  // Get statistics specific to this cache instance
  const stats = openaiCache.getStats();
  console.log(`OpenAI Cache hit rate: ${stats.hitRate}%`);
  console.log(`OpenAI Cost saved: $${stats.totalCostSaved.toFixed(2)}`);
}

getOpenAIChatCompletion();
import { AnthropicCache } from 'smart-ai-cache';
import Anthropic from '@anthropic-ai/sdk';
const anthropicClient = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
const anthropicCache = new AnthropicCache({
  ttl: 3600,
}, anthropicClient); // Pass the initialized Anthropic client

async function getAnthropicMessage() {
  const response = await anthropicCache.messages({
    model: 'claude-3-sonnet-20240229',
    max_tokens: 100,
    messages: [{ role: 'user', content: 'Tell me a short story.' }],
  });
  console.log('Anthropic Response:', response.content[0].text);

  const stats = anthropicCache.getStats();
  console.log(`Anthropic Cache hit rate: ${stats.hitRate}%`);
  console.log(`Anthropic Cost saved: $${stats.totalCostSaved.toFixed(2)}`);
}

getAnthropicMessage();
import { GoogleCache } from 'smart-ai-cache';

// GoogleCache can take the API key directly or read it from process.env.GOOGLE_API_KEY
const googleCache = new GoogleCache({
  ttl: 3600,
}, process.env.GOOGLE_API_KEY);

async function getGoogleContent() {
  const response = await googleCache.generateContent({
    model: 'gemini-pro',
    contents: [{ role: 'user', parts: [{ text: 'What is the capital of France?' }] }],
  });
  console.log('Google Response:', response.response.text());

  const stats = googleCache.getStats();
  console.log(`Google Cache hit rate: ${stats.hitRate}%`);
  // Note: cost tracking for Google models is not yet implemented due to API limitations.
  console.log(`Google Cost saved: $${stats.totalCostSaved.toFixed(2)}`);
}

getGoogleContent();
Cache keys are generated to ensure uniqueness and consistency across requests. The format is:
{prefix}:{provider}:{model}:{promptHash}:{paramsHash}
Here, provider is the AI provider name (openai, anthropic, or google), model is the model identifier (for example gpt-4, claude-3-opus, or gemini-pro), and promptHash and paramsHash are hashes of the prompt and of the remaining request parameters.
This strategy ensures that identical requests (same provider, model, prompt, and parameters) result in the same cache key.
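Conceptually, the key construction can be sketched as follows. This is an illustration of the scheme only, not the library's actual hashing code (a real implementation would also normalize parameter ordering before hashing):

import { createHash } from 'crypto';

// Illustrative only: short, stable hashes of the prompt and of the remaining
// parameters keep identical requests on the same cache key.
const shortHash = (value: unknown): string =>
  createHash('sha256').update(JSON.stringify(value)).digest('hex').slice(0, 16);

function buildCacheKey(
  prefix: string,                  // e.g. 'ai-cache:'
  provider: string,                // e.g. 'openai'
  model: string,                   // e.g. 'gpt-4'
  prompt: unknown,                 // the messages/contents payload
  params: Record<string, unknown>  // temperature, max_tokens, ...
): string {
  return `${prefix}${provider}:${model}:${shortHash(prompt)}:${shortHash(params)}`;
}

// buildCacheKey('ai-cache:', 'openai', 'gpt-4',
//   [{ role: 'user', content: 'Hello, world!' }], { temperature: 0.7 });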
The AIResponseCache class and its provider-specific extensions provide a getStats() method that returns a CacheStats object:
interface CacheStats {
  totalRequests: number;
  cacheHits: number;
  cacheMisses: number;
  hitRate: number;             // Percentage
  totalCostSaved: number;      // USD
  averageResponseTime: number; // Milliseconds
  lastResetTime: Date;
  byProvider: {
    [provider: string]: {
      requests: number;
      hits: number;
      costSaved: number;
    };
  };
}
You can reset the statistics at any time using the resetStats() method.
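For example, you can report and then reset statistics on a fixed interval. This sketch reuses the cache instance from the quick start and relies only on the fields of the CacheStats interface shown above:

// Report cumulative statistics, then start a fresh measurement window.
function reportAndReset() {
  const stats = cache.getStats();
  console.log(
    `requests=${stats.totalRequests} hits=${stats.cacheHits} ` +
    `hitRate=${stats.hitRate}% saved=$${stats.totalCostSaved.toFixed(2)}`
  );
  for (const [provider, s] of Object.entries(stats.byProvider)) {
    console.log(`  ${provider}: ${s.hits}/${s.requests} hits, $${s.costSaved.toFixed(2)} saved`);
  }
  cache.resetStats();
}

// Report once a minute
setInterval(reportAndReset, 60_000);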
We welcome contributions! Please see our contributing guidelines (coming soon) for more information.
This project is licensed under the MIT License - see the LICENSE file for details.