Cache Multi-Shard Read Analysis & Design
Executive Summary
This document analyzes which cache types need multi-shard read support and how they should be sharded, then designs a central multi-shard read system with timestamp-based indexing for pagination.
1. Cache Type Analysis
1.1 Internal Logs Cache ✅ NEEDS MULTI-SHARD
Current State:
- ✅ Already uses monthly sharding (monthly: true)
- ✅ Already implements multi-shard reads via getRegistryShards()
- ✅ Has timestamp field in entries
- ❌ Missing: Index key per shard for fast key listing (listLogKeys() is a stub)
Sharding Strategy: Month-based (already implemented)
- Registry: registry:cache:internal:2025-02
- Shard: cache:internal:shard:${random16}
- Key: internal:log:${uuid}
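As a rough illustration of the key layout above, the three key shapes could be produced by small helpers. The helper names are hypothetical, months are assumed to be computed in UTC, and the shard suffix is passed in rather than generated, since the source only specifies the key shapes.

```typescript
// Hypothetical helpers; only the key shapes come from the layout above.

/** Formats a date's UTC month as "YYYY-MM", e.g. "2025-02". */
function monthKey(date: Date): string {
  const month = date.getUTCMonth() + 1; // getUTCMonth() is 0-based
  const mm = month < 10 ? `0${month}` : `${month}`;
  return `${date.getUTCFullYear()}-${mm}`;
}

/** Registry key for the month that contains `date`. */
function registryKey(date: Date): string {
  return `registry:cache:internal:${monthKey(date)}`;
}

/** Shard id built from a caller-supplied suffix (how the suffix is generated is not shown). */
function shardId(suffix: string): string {
  return `cache:internal:shard:${suffix}`;
}

/** Entry key for a single log record. */
function logKey(uuid: string): string {
  return `internal:log:${uuid}`;
}
```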
Why Multi-Shard:
- Logs accumulate over time (monthly rotation)
- Admin queries need to read across months
- Each month can have multiple shards if volume grows
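To make the cross-month read concrete, here is a minimal sketch of how a query range could be expanded into the months it touches, newest first. The function name is hypothetical, and it assumes timestamps are epoch milliseconds and that monthly shards are keyed by UTC month.

```typescript
// Hypothetical helper: lists the "YYYY-MM" months that a query range touches,
// newest first. Assumes epoch-millisecond timestamps and UTC month boundaries.
function monthsBetween(fromTimestamp: number, toTimestamp: number): string[] {
  const months: string[] = [];
  const from = new Date(fromTimestamp);
  const to = new Date(toTimestamp);
  const fromYear = from.getUTCFullYear();
  const fromMonth = from.getUTCMonth(); // 0-based
  let year = to.getUTCFullYear();
  let month = to.getUTCMonth(); // 0-based

  // Walk backwards one month at a time until we pass the start month.
  while (year > fromYear || (year === fromYear && month >= fromMonth)) {
    const mm = month + 1 < 10 ? `0${month + 1}` : `${month + 1}`;
    months.push(`${year}-${mm}`);
    month -= 1;
    if (month < 0) {
      month = 11;
      year -= 1;
    }
  }
  return months;
}
```

Each returned month would then map to one registry lookup, and each registry can list one or more shards for that month.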
Indexing Needs:
- Index key per shard: internal:index:${shardId} → Array of {id, timestamp} sorted by timestamp DESC
- Purpose: Fast listing without storage.list() (DO doesn't expose to Workers)
- Update: Append to index on each log write
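One possible way to keep the index sorted on write is to insert at the right position instead of re-sorting the whole array. This is a sketch of that alternative, not the existing implementation; the function name is hypothetical and the entry shape follows the {id, timestamp} structure described above.

```typescript
interface IndexEntry {
  id: string;
  timestamp: number;
}

// Inserts an entry into an index that is already sorted by timestamp DESC,
// using binary search to find the position. Returns a new array and leaves
// the input untouched. Entries with equal timestamps keep the newest write first.
function insertIntoIndex(index: IndexEntry[], entry: IndexEntry): IndexEntry[] {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (index[mid].timestamp > entry.timestamp) {
      lo = mid + 1; // existing entry is newer, so the new one belongs further right
    } else {
      hi = mid;
    }
  }
  return index.slice(0, lo).concat([entry], index.slice(lo));
}
```

Because logs usually arrive in time order, most inserts land at position 0, so the search finishes quickly.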
Pagination:
- Current: limit only (no cursor)
- Needed: Timestamp-based cursor (cursor: ${timestamp}:${id})
- Query: Filter by fromTimestamp/toTimestamp, sort DESC, apply cursor
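The cursor format above can be sketched as a pair of encode and parse helpers. The names are hypothetical, and the parser assumes the timestamp is a non-negative integer (for example epoch milliseconds).

```typescript
interface Cursor {
  timestamp: number;
  id: string;
}

// Builds the "${timestamp}:${id}" cursor string.
function encodeCursor(cursor: Cursor): string {
  return `${cursor.timestamp}:${cursor.id}`;
}

// Splits on the first ":" only, so ids that themselves contain ":" survive a
// round trip. Returns null when the timestamp part is not all digits or
// either part is empty.
function parseCursor(raw: string): Cursor | null {
  const sep = raw.indexOf(":");
  if (sep <= 0 || sep === raw.length - 1) return null;
  const head = raw.slice(0, sep);
  if (!/^\d+$/.test(head)) return null;
  return { timestamp: Number(head), id: raw.slice(sep + 1) };
}
```

Returning null rather than throwing lets the caller decide whether a malformed cursor means "start from the first page" or a client error.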
1.2 Maintenance Cache ⚠️ MAYBE MULTI-SHARD
Current State:
- ✅ Single shard (active shard only)
- ✅ Keys: maintenance:${requestId}
- ✅ Data has createdAt, updatedAt, dueDate timestamps
- ❌ Cache is only for single-request lookups (not list queries)
- ❌ List queries go to DB, not cache
Sharding Strategy: ID-based (via registry, not hash)
- Registry manages shards automatically when size limit reached
- No monthly sharding needed (requests don't accumulate like logs)
Why Multi-Shard (if implemented):
- If we want to cache list query results (e.g., "all requests for user X")
- If maintenance cache grows beyond one shard limit
- Currently: Single-request cache hits don't need multi-shard
Indexing Needs:
- Only if caching list queries: Index by userId, departmentId, status, etc.
- Current: No indexing needed (single-key lookups only)
Pagination:
- Current: DB handles pagination (offset/limit)
- Cache: No pagination (single requests only)
Recommendation: Keep single-shard for now. Add multi-shard only if:
- We start caching list query results
- Single shard hits size limit
1.3 Auth Rate Limit Cache ❌ NO MULTI-SHARD
Current State:
- ✅ Single shard
- ✅ Keys: auth:attempts:${identifier} (email/userId)
- ✅ Data: {count: number, lastAttempt: number}
- ✅ Per-identifier lookups only
Sharding Strategy: N/A (single shard sufficient)
Why No Multi-Shard:
- Per-identifier lookups (no list queries)
- TTL-based expiration (5min success, 1hr failure)
- No historical queries needed
Indexing Needs: None
Pagination: None needed
1.4 Analytics Cache ⚠️ MAYBE MULTI-SHARD
Current State:
- ✅ Single shard
- ✅ Keys: Query-based (analytics:${fromDate}-${toDate})
- ✅ Caches aggregated analytics results
Sharding Strategy: Month-based (if implemented)
- Could shard by month for historical analytics
- Current: Single shard caches query results
Why Multi-Shard (if implemented):
- If we want to aggregate analytics across months
- If we store raw analytics events (not just aggregated results)
Indexing Needs:
- Only if storing raw events: Timestamp-based index
- Current: Query-based caching (no indexing needed)
Pagination:
- Current: Query-based (no pagination)
- If raw events: Timestamp-based cursor
Recommendation: Keep single-shard for now. Add monthly sharding only if we store raw analytics events.
2. Sharding Strategy Summary
| Cache Type | Multi-Shard? | Strategy | Reason |
|---|---|---|---|
| Internal Logs | ✅ YES | Month-based | Logs accumulate, admin queries across months |
| Maintenance | ⚠️ MAYBE | ID-based (registry) | Only if caching list queries or size limit |
| Auth Rate Limits | ❌ NO | Single shard | Per-identifier only, TTL-based |
| Analytics | ⚠️ MAYBE | Month-based | Only if storing raw events |
3. Timestamp-Based Indexing Analysis
3.1 Internal Logs ✅ NEEDS INDEXING
Current: listLogKeys() returns [] (stub)
Required:
- Index key per shard: internal:index:${shardId}
- Structure: Array<{id: string, timestamp: number}> sorted DESC
- Update: Append on each log write
- Read: Use index to get log IDs, then fetch entries
Implementation:
// On write: add the new entry to this shard's index, keeping it sorted newest first.
const indexKey = `internal:index:${shardId}`;
const writeIndex = (await cache.get(shardId, indexKey)) || [];
writeIndex.push({ id: logEntry.id, timestamp: logEntry.timestamp });
writeIndex.sort((a, b) => b.timestamp - a.timestamp); // DESC
await cache.set(shardId, indexKey, writeIndex, 0);

// On read: load the shard's index and collect the log IDs it lists.
const readIndex = (await cache.get(shardId, indexKey)) || [];
const logIds = readIndex.map((e) => e.id);
// Fetch entries by IDs, filter by timestamp range

Pagination:
- Cursor format: ${timestamp}:${id}
- Query: Filter index by fromTimestamp/toTimestamp, apply cursor, limit
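The query steps listed for pagination (filter by range, apply cursor, limit) might look like the following when applied to a single shard's index. This is a sketch under assumptions: the function and type names are hypothetical, the timestamp range is treated as inclusive on both ends, the cursor is assumed to be well formed, and ties on timestamp are broken by id in descending order.

```typescript
interface IndexEntry { id: string; timestamp: number; }

interface PageOptions {
  fromTimestamp?: number;
  toTimestamp?: number;
  cursor?: string; // "${timestamp}:${id}" of the last entry already returned
  limit: number;
}

interface Page { entries: IndexEntry[]; nextCursor?: string; hasMore: boolean; }

// Total order used for paging: timestamp DESC, then id DESC as a tie-break.
function compareDesc(a: IndexEntry, b: IndexEntry): number {
  if (a.timestamp !== b.timestamp) return b.timestamp - a.timestamp;
  return a.id < b.id ? 1 : a.id > b.id ? -1 : 0;
}

// Turns a cursor string back into the (timestamp, id) position it encodes.
function cursorPosition(cursor: string): IndexEntry {
  const sep = cursor.indexOf(":");
  return { timestamp: Number(cursor.slice(0, sep)), id: cursor.slice(sep + 1) };
}

function pageIndex(index: IndexEntry[], opts: PageOptions): Page {
  const after = opts.cursor === undefined ? undefined : cursorPosition(opts.cursor);
  const matches = index
    .filter((e) => opts.fromTimestamp === undefined || e.timestamp >= opts.fromTimestamp)
    .filter((e) => opts.toTimestamp === undefined || e.timestamp <= opts.toTimestamp)
    // Keep only entries that sort strictly after the cursor position.
    .filter((e) => after === undefined || compareDesc(after, e) < 0)
    .sort(compareDesc);

  const entries = matches.slice(0, opts.limit);
  const hasMore = matches.length > entries.length;
  const page: Page = { entries, hasMore };
  if (hasMore && entries.length > 0) {
    const last = entries[entries.length - 1];
    page.nextCursor = `${last.timestamp}:${last.id}`;
  }
  return page;
}
```

Because the cursor carries the id as well as the timestamp, two entries written in the same millisecond are neither skipped nor returned twice across pages.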
3.2 Maintenance Cache ❌ NO INDEXING NEEDED
Reason: Cache is for single-request lookups only. List queries go to DB.
If we add list caching later:
- Index by userId, departmentId, status, etc.
- Timestamp-based index for date range queries
3.3 Auth Rate Limits ❌ NO INDEXING NEEDED
Reason: Per-identifier lookups only, no list queries.
3.4 Analytics Cache ❌ NO INDEXING NEEDED (CURRENT)
Reason: Query-based caching (aggregated results), not raw events.
If we add raw event storage:
- Timestamp-based index per month shard
- Similar to internal logs
4. Central Multi-Shard Read Function Design
4.1 Requirements
- Central function (one shared entry point, in the same way the registry centralizes registry operations)
- Handles multi-shard reads for caches that need it
- Timestamp-based filtering (fromTimestamp/toTimestamp)
- Cursor-based pagination (timestamp:id cursor)
- Filtering (level, category, service, etc.)
- Sorting (timestamp DESC default)
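The filtering requirement could be expressed as a single predicate that a reader applies to each entry. The filter names level, category and service come from the list above; the entry shape, the exact-match semantics and the function name are assumptions made for illustration.

```typescript
// Assumed entry shape: only the filterable fields named above are modeled.
interface LogEntry {
  id: string;
  timestamp: number;
  level?: string;
  category?: string;
  service?: string;
}

interface LogFilters {
  fromTimestamp?: number;
  toTimestamp?: number;
  level?: string;
  category?: string;
  service?: string;
}

// Returns true when the entry satisfies every filter that is set.
// Unset filters are ignored; the timestamp range is inclusive.
function matchesFilters(entry: LogEntry, filters: LogFilters): boolean {
  if (filters.fromTimestamp !== undefined && entry.timestamp < filters.fromTimestamp) return false;
  if (filters.toTimestamp !== undefined && entry.timestamp > filters.toTimestamp) return false;
  if (filters.level !== undefined && entry.level !== filters.level) return false;
  if (filters.category !== undefined && entry.category !== filters.category) return false;
  if (filters.service !== undefined && entry.service !== filters.service) return false;
  return true;
}
```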
4.2 Design: MultiShardReader Class
Location: lib/services/cache/multi-shard-reader/index.ts
Responsibilities:
- Get shards from registry (getRegistryShards())
- Read from each shard (using index keys)
- Filter by timestamp range
- Apply cursor pagination
- Merge and sort results
- Return paginated results with next cursor
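For the merge-and-sort responsibility, one option is a k-way merge that relies on each shard's results already being sorted newest first, so the reader can stop once it has enough entries. The sketch below assumes that precondition; the names and the id tie-break are hypothetical.

```typescript
interface Entry { id: string; timestamp: number; }

// Orders newest first, breaking timestamp ties by id DESC (an assumed tie-break).
function newerFirst(a: Entry, b: Entry): number {
  if (a.timestamp !== b.timestamp) return b.timestamp - a.timestamp;
  return a.id < b.id ? 1 : a.id > b.id ? -1 : 0;
}

// Merges per-shard lists that are each already sorted newest first,
// stopping as soon as `limit` entries have been collected.
function mergeShardResults(shards: Entry[][], limit: number): Entry[] {
  const positions = shards.map(() => 0); // next unread position in each shard
  const merged: Entry[] = [];
  while (merged.length < limit) {
    let best = -1;
    for (let s = 0; s < shards.length; s++) {
      const candidate = shards[s][positions[s]];
      if (candidate === undefined) continue; // this shard is exhausted
      if (best === -1 || newerFirst(candidate, shards[best][positions[best]]) < 0) {
        best = s;
      }
    }
    if (best === -1) break; // every shard is exhausted
    merged.push(shards[best][positions[best]]);
    positions[best] += 1;
  }
  return merged;
}
```

With a handful of shards the linear scan per step is cheap; a heap would only pay off with many shards per query.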
API:
interface MultiShardReadOptions<T> {
  cacheType: string;
  monthly?: boolean;
  indexKeyPrefix: string; // e.g., "internal:index"
  entryKeyPrefix: string; // e.g., "internal:log"
  filters?: {
    fromTimestamp?: number;
    toTimestamp?: number;
    // ... other filters
  };
  cursor?: string; // "timestamp:id"
  limit?: number;
  transform?: (entry: unknown) => T; // Transform cache entry to T
}

interface MultiShardReadResult<T> {
  entries: T[];
  nextCursor?: string;
  hasMore: boolean;
  shardsRead: number;
}

// Signature only: declared without a body, so the async keyword is omitted.
declare class MultiShardReader {
  read<T>(options: MultiShardReadOptions<T>): Promise<MultiShardReadResult<T>>;
}

5. Implementation Plan
Phase 1: Index Key Support for Internal Logs
- ✅ Update InternalCacheService.log() to update index key
- ✅ Implement listLogKeys() using index key
- ✅ Update getLogsFromShard() to use index
Phase 2: Central Multi-Shard Reader
- ✅ Create MultiShardReader class
- ✅ Implement timestamp-based filtering
- ✅ Implement cursor pagination
- ✅ Update InternalCacheService.getLogs() to use MultiShardReader
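As a sketch of the last step, getLogs() could become a thin wrapper that delegates to the reader. The option and result shapes are copied from the API in section 4.2 and the two key prefixes come from its comments. Everything else is an assumption: the free function getLogsViaReader, the MultiShardReaderLike interface, the cache type "internal", the default limit of 50, and the LogEntry shape.

```typescript
// Shapes as in section 4.2.
interface MultiShardReadOptions<T> {
  cacheType: string;
  monthly?: boolean;
  indexKeyPrefix: string;
  entryKeyPrefix: string;
  filters?: { fromTimestamp?: number; toTimestamp?: number };
  cursor?: string; // "timestamp:id"
  limit?: number;
  transform?: (entry: unknown) => T;
}

interface MultiShardReadResult<T> {
  entries: T[];
  nextCursor?: string;
  hasMore: boolean;
  shardsRead: number;
}

// Hypothetical interface so the reader can be swapped out in tests.
interface MultiShardReaderLike {
  read<T>(options: MultiShardReadOptions<T>): Promise<MultiShardReadResult<T>>;
}

// Assumed minimal log shape.
interface LogEntry { id: string; timestamp: number; }

// Hypothetical wrapper: forwards filters, cursor and limit to the central reader.
function getLogsViaReader(
  reader: MultiShardReaderLike,
  filters: { fromTimestamp?: number; toTimestamp?: number },
  cursor?: string,
  limit: number = 50, // assumed default page size
): Promise<MultiShardReadResult<LogEntry>> {
  return reader.read<LogEntry>({
    cacheType: "internal", // assumed cache type name
    monthly: true,
    indexKeyPrefix: "internal:index",
    entryKeyPrefix: "internal:log",
    filters,
    cursor,
    limit,
    transform: (entry) => entry as LogEntry, // no validation in this sketch
  });
}
```

Keeping the wrapper this thin means pagination and merge behavior live in one place, which is the point of the central reader.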
Phase 3: Future (if needed)
- Add multi-shard support for maintenance (if caching list queries)
- Add monthly sharding for analytics (if storing raw events)
6. Recommendations
✅ DO NOW:
- Internal logs indexing: Implement index key per shard
- Central multi-shard reader: Create reusable function
- Cursor pagination: Add to internal logs
⚠️ DEFER:
- Maintenance multi-shard: Only if caching list queries
- Analytics monthly sharding: Only if storing raw events
❌ DON'T:
- Auth rate limits multi-shard: Not needed (per-identifier only)
7. Key Decisions
- Index keys stored in same shard (not separate index shard) - simpler, co-located
- Cursor format: timestamp:id (not just timestamp) - handles entries with duplicate timestamps
- Central reader (not per-cache) - DRY, consistent behavior
- Timestamp-based only (not ID-based) - aligns with time-series data