Cache Multi-Shard Read Implementation Summary
✅ What Was Done
1. Analysis Document Created
File: docs/architecture/cache-multi-shard-analysis.md
Comprehensive analysis covering:
- Which caches need multi-shard reads (internal logs ✅, maintenance ⚠️, auth ❌, analytics ⚠️)
- Sharding strategies (month-based vs ID-based)
- Timestamp-based indexing requirements
- Pagination/cursor support needs
2. Central Multi-Shard Reader Implemented
File: apps/worker/src/lib/services/cache/multi-shard-reader/index.ts
Features:
- ✅ Central function (same centralized pattern as the registry)
- ✅ Reads from multiple shards via registry
- ✅ Timestamp-based filtering (`fromTimestamp`/`toTimestamp`)
- ✅ Cursor-based pagination (`cursor: timestamp:id`)
- ✅ Custom filtering (level, category, service, etc.)
- ✅ Result merging and sorting (timestamp DESC)
- ✅ Returns paginated results with `nextCursor` and `hasMore`
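The merging and pagination features listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `multi-shard-reader/index.ts`: `Entry`, `Page`, `compareDesc`, and `mergeShardResults` are hypothetical names, and breaking timestamp ties by id is an assumption made here so the order is deterministic.

```typescript
// Hypothetical entry shape; only `id` and `timestamp` come from the document.
interface Entry {
  id: string;
  timestamp: number;
}

interface Page<T> {
  entries: T[];
  nextCursor: string | null;
  hasMore: boolean;
}

// Newest first; ties broken by id (descending) so the order is stable across shards.
function compareDesc(a: Entry, b: Entry): number {
  if (a.timestamp !== b.timestamp) return b.timestamp - a.timestamp;
  return a.id < b.id ? 1 : a.id > b.id ? -1 : 0;
}

// Merge the per-shard results, sort timestamp DESC, and cut a single page.
function mergeShardResults<T extends Entry>(perShard: T[][], limit: number): Page<T> {
  const merged = ([] as T[]).concat(...perShard).sort(compareDesc);
  const entries = merged.slice(0, limit);
  const hasMore = merged.length > limit;
  const last = entries[entries.length - 1];
  return {
    entries,
    hasMore,
    // Cursor uses the documented "timestamp:id" format.
    nextCursor: hasMore && last ? `${last.timestamp}:${last.id}` : null,
  };
}

const page = mergeShardResults(
  [
    [{ id: "a", timestamp: 300 }, { id: "b", timestamp: 100 }],
    [{ id: "c", timestamp: 200 }],
  ],
  2,
);
// page.entries holds "a" then "c"; page.nextCursor is "200:c"; page.hasMore is true
```

Sorting the concatenated results is the simplest option; a k-way merge of the already-sorted shard indexes would avoid the full sort if shard counts grow.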
API:
```typescript
const reader = createMultiShardReader(env);
const result = await reader.read<InternalLogEntry>({
  cacheType: 'internal',
  monthly: true,
  indexKeyPrefix: 'internal:index',
  entryKeyPrefix: 'internal:log',
  filters: { fromTimestamp: 1234567890, toTimestamp: 1234567999 },
  cursor: '1234567890:log-id-123',
  limit: 100,
  customFilter: (entry) => entry.level === 'error',
});
```
3. Internal Logs Cache Updated
File: apps/worker/src/lib/services/cache/internal/index.ts
Changes:
- ✅ Added index key per shard (`internal:index:${shardId}`)
- ✅ Index stores `Array<{id: string, timestamp: number}>` sorted DESC
- ✅ `updateIndexKey()` updates index on each log write
- ✅ `getLogs()` now uses `MultiShardReader` instead of manual shard iteration
- ✅ Supports timestamp filtering, custom filters (level, category, service), and pagination
Index Key Structure:
```typescript
// Key: internal:index:${shardId}
// Value: Array<{id: string, timestamp: number}>
[
  {id: 'log-uuid-1', timestamp: 1737000000000},
  {id: 'log-uuid-2', timestamp: 1736999999999},
  // ... sorted DESC
]
```
📊 Cache Type Status
| Cache Type | Multi-Shard? | Indexing? | Pagination? | Status |
|---|---|---|---|---|
| Internal Logs | ✅ YES | ✅ YES | ✅ YES | IMPLEMENTED |
| Maintenance | ⚠️ MAYBE | ❌ NO | ❌ NO | Single-shard (cache for single requests only) |
| Auth Rate Limits | ❌ NO | ❌ NO | ❌ NO | Single-shard (per-identifier only) |
| Analytics | ⚠️ MAYBE | ❌ NO | ❌ NO | Single-shard (query-based caching) |
🎯 Key Decisions
1. Sharding Strategy
- Internal logs: Month-based (already implemented)
- Maintenance: ID-based via registry (only if caching list queries)
- Auth: Single shard (not needed)
- Analytics: Month-based (only if storing raw events)
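As a rough illustration of month-based sharding, the sketch below derives a shard ID from a timestamp and lists the monthly shards a time range touches. The `<cacheType>-YYYY-MM` format and the names `monthlyShardId` and `shardsForRange` are assumptions for this example; the registry's actual shard IDs may look different.

```typescript
// Hypothetical shard ID format: "<cacheType>-YYYY-MM" in UTC.
function monthlyShardId(cacheType: string, timestamp: number): string {
  const d = new Date(timestamp);
  const m = d.getUTCMonth() + 1;
  const month = m < 10 ? "0" + m : String(m);
  return `${cacheType}-${d.getUTCFullYear()}-${month}`;
}

// Lists the monthly shards overlapping [from, to], newest first.
// Returns an empty list when `from` is later than `to`.
function shardsForRange(cacheType: string, from: number, to: number): string[] {
  const result: string[] = [];
  const start = new Date(from);
  const end = new Date(to);
  const startYear = start.getUTCFullYear();
  const startMonth = start.getUTCMonth(); // 0-based
  let year = end.getUTCFullYear();
  let month = end.getUTCMonth(); // 0-based
  while (year > startYear || (year === startYear && month >= startMonth)) {
    result.push(monthlyShardId(cacheType, Date.UTC(year, month, 1)));
    month -= 1;
    if (month < 0) {
      month = 11;
      year -= 1;
    }
  }
  return result;
}

const shards = shardsForRange("internal", Date.UTC(2024, 10, 15), Date.UTC(2025, 0, 16));
// shards: ["internal-2025-01", "internal-2024-12", "internal-2024-11"]
```

Restricting a read to the shards that overlap the requested range keeps a query for the last 24 hours from touching every month ever written.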
2. Indexing Strategy
- Index keys stored in same shard (not separate index shard)
- Format: `{cacheType}:index:${shardId}` → `Array<{id, timestamp}>`
- Update: Append on write, sort DESC by timestamp
- Purpose: Fast listing without `storage.list()` (the DO doesn't expose it to Workers)
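The index update can be sketched as below. Note that this example swaps in a binary-search insert instead of the append-then-sort described above; both keep the array sorted DESC, but the insert avoids re-sorting the whole index on every write. `IndexEntry`, `insertSorted`, and `indexKeyFor` are hypothetical names used only for illustration.

```typescript
interface IndexEntry {
  id: string;
  timestamp: number;
}

// Builds the documented key format: {cacheType}:index:${shardId}
const indexKeyFor = (cacheType: string, shardId: string): string =>
  `${cacheType}:index:${shardId}`;

// Returns a new array with `entry` inserted so the index stays sorted by timestamp DESC.
function insertSorted(index: IndexEntry[], entry: IndexEntry): IndexEntry[] {
  let lo = 0;
  let hi = index.length;
  // Binary search for the first position whose timestamp is <= entry.timestamp.
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (index[mid].timestamp > entry.timestamp) lo = mid + 1;
    else hi = mid;
  }
  return index.slice(0, lo).concat(entry, index.slice(lo));
}

let shardIndex: IndexEntry[] = [];
shardIndex = insertSorted(shardIndex, { id: "log-1", timestamp: 100 });
shardIndex = insertSorted(shardIndex, { id: "log-2", timestamp: 300 });
shardIndex = insertSorted(shardIndex, { id: "log-3", timestamp: 200 });
// shardIndex ids, newest first: log-2, log-3, log-1
```

Either approach is still a read-modify-write of a single key, so concurrent writers to the same shard could overwrite each other's index updates unless writes are serialized.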
3. Pagination Strategy
- Cursor format: `timestamp:id` (not just timestamp, so entries with duplicate timestamps are handled)
- Query: Filter index by timestamp range, apply cursor, limit
- Result: Returns entries + `nextCursor` + `hasMore`
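To make the cursor handling concrete, here is a minimal sketch of encoding, decoding, and applying a `timestamp:id` cursor to a DESC-sorted index. The function names are hypothetical, and the sketch assumes entries with equal timestamps are ordered by id descending; the real reader may break ties differently.

```typescript
interface IndexEntry {
  id: string;
  timestamp: number;
}

function encodeCursor(e: IndexEntry): string {
  return `${e.timestamp}:${e.id}`;
}

// Splits on the first colon only, so ids that contain ":" still round-trip.
function decodeCursor(cursor: string): IndexEntry {
  const sep = cursor.indexOf(":");
  const timestamp = Number(cursor.slice(0, sep));
  if (sep <= 0 || isNaN(timestamp)) throw new Error(`Malformed cursor: ${cursor}`);
  return { timestamp, id: cursor.slice(sep + 1) };
}

// Returns the page that follows `cursor` in an index sorted by timestamp DESC, then id DESC.
function pageAfter(index: IndexEntry[], cursor: string | null, limit: number) {
  const after = cursor ? decodeCursor(cursor) : null;
  const remaining = after
    ? index.filter(
        (e) =>
          e.timestamp < after.timestamp ||
          (e.timestamp === after.timestamp && e.id < after.id),
      )
    : index;
  const entries = remaining.slice(0, limit);
  const hasMore = remaining.length > limit;
  const last = entries[entries.length - 1];
  return { entries, hasMore, nextCursor: hasMore && last ? encodeCursor(last) : null };
}

// Two entries share timestamp 200; the id part of the cursor keeps them apart.
const sortedIndex: IndexEntry[] = [
  { id: "b", timestamp: 200 },
  { id: "a", timestamp: 200 },
  { id: "c", timestamp: 100 },
];
const firstPage = pageAfter(sortedIndex, null, 1); // entry "b", nextCursor "200:b"
const secondPage = pageAfter(sortedIndex, firstPage.nextCursor, 1); // entry "a"
const thirdPage = pageAfter(sortedIndex, secondPage.nextCursor, 1); // entry "c", no more pages
```

With a timestamp-only cursor, the second page would have skipped entry `a`, because it shares its timestamp with the last entry of the first page.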
4. Central vs Per-Cache
- Central reader (like registry) - DRY, consistent behavior
- Reusable for any cache that needs multi-shard reads
- Customizable via `customFilter` and `transform` functions
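One way a central reader could stay generic while letting each cache customize its results is sketched below. `ReadHooks`, `applyHooks`, and the `LogEntry` shape are hypothetical; only the option names `customFilter` and `transform` come from this document, and their real signatures may differ.

```typescript
interface ReadHooks<T, R> {
  customFilter?: (entry: T) => boolean; // keep only matching entries
  transform?: (entry: T) => R; // map each kept entry to the caller's shape
}

// Applies the optional filter first, then the optional transform.
// Without a transform, entries pass through unchanged (R is then expected to equal T).
function applyHooks<T, R = T>(entries: T[], hooks: ReadHooks<T, R>): R[] {
  const kept = hooks.customFilter ? entries.filter(hooks.customFilter) : entries;
  return hooks.transform ? kept.map(hooks.transform) : (kept as unknown as R[]);
}

// Hypothetical log shape for the example.
interface LogEntry {
  id: string;
  timestamp: number;
  level: "info" | "error";
  message: string;
}

const sampleLogs: LogEntry[] = [
  { id: "1", timestamp: 300, level: "error", message: "disk full" },
  { id: "2", timestamp: 200, level: "info", message: "started" },
  { id: "3", timestamp: 100, level: "error", message: "timeout" },
];

const errorMessages = applyHooks(sampleLogs, {
  customFilter: (e: LogEntry) => e.level === "error",
  transform: (e: LogEntry) => e.message,
});
// errorMessages: ["disk full", "timeout"]
```

Because the reader only depends on these two callbacks, cache-specific logic such as filtering by level or category stays in the calling cache rather than in the shared code.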
🔧 How It Works
Writing Logs (Internal Cache Example)
```typescript
// 1. Get active shard from registry
const shard = await registry.getActiveShard('internal', true); // monthly=true
// 2. Store log entry
await cache.set(shard, `internal:log:${logId}`, logEntry, ttl);
// 3. Update index key
const indexKey = `internal:index:${shard}`;
const index = await cache.get(shard, indexKey) || [];
index.push({id: logId, timestamp: logEntry.timestamp});
index.sort((a, b) => b.timestamp - a.timestamp); // DESC
await cache.set(shard, indexKey, index, ttl);
```
Reading Logs (Multi-Shard)
```typescript
// 1. Multi-shard reader gets all shards from registry
const shards = await registry.getRegistryShards('internal', {monthly: true});
// 2. For each shard:
//    - Read index key
//    - Filter by timestamp range
//    - Apply cursor
//    - Fetch entries by IDs
//    - Apply custom filters
// 3. Merge all results, sort DESC, apply limit
// 4. Return entries + nextCursor + hasMore
```
🚀 Next Steps (If Needed)
Maintenance Cache Multi-Shard (Future)
Only if:
- We start caching list query results (not just single requests)
- Single shard hits size limit
Implementation:
- Use `MultiShardReader` with `indexKeyPrefix: 'maintenance:index'`
- Index by `userId`, `departmentId`, `status`, etc. (not just timestamp)
- Or: Cache list query results with TTL (simpler)
Analytics Monthly Sharding (Future)
Only if:
- We store raw analytics events (not just aggregated results)
Implementation:
- Use monthly sharding (`monthly: true`)
- Use `MultiShardReader` for cross-month aggregation
- Index by timestamp for fast queries
📝 Usage Examples
Internal Logs - Get Recent Errors
```typescript
const logs = await internalCache.getLogs({
  level: 'error',
  fromTimestamp: Date.now() - 24 * 60 * 60 * 1000, // Last 24h
  limit: 50,
});
```
Internal Logs - Paginated Query
```typescript
// First page
const result1 = await multiShardReader.read({
  cacheType: 'internal',
  monthly: true,
  indexKeyPrefix: 'internal:index',
  entryKeyPrefix: 'internal:log',
  filters: { fromTimestamp: startTime, toTimestamp: endTime },
  limit: 100,
});
// Next page
const result2 = await multiShardReader.read({
  // ... same options
  cursor: result1.nextCursor, // Continue from cursor
});
```
✅ Testing Checklist
- [ ] Internal logs: Write logs, verify index key updates
- [ ] Internal logs: Read logs with filters (level, category, service)
- [ ] Internal logs: Read logs with timestamp range
- [ ] Internal logs: Pagination with cursor
- [ ] Multi-shard: Read from multiple monthly shards
- [ ] Multi-shard: Merge and sort results correctly
- [ ] Edge cases: Empty shards, deleted entries, cursor boundaries
🎉 Summary
- ✅ Central multi-shard reader implemented (like registry pattern)
- ✅ Internal logs now use index keys and multi-shard reader
- ✅ Timestamp-based indexing for fast pagination
- ✅ Cursor pagination support
- ✅ Reusable for future caches (maintenance, analytics) if needed
Ready for testing! 🚀