Caching and CDN Integration
The Unbroken Protocol is designed to work efficiently with CDNs, proxies, and HTTP caches. This page documents caching strategies and CDN integration patterns.
Caching Model
Different operations have different caching characteristics:
| Operation | Cacheable | Reason |
|---|---|---|
| Catch-up reads | Yes | Historical data is immutable |
| Long-poll (200) | Conditionally | New data at specific offset |
| Long-poll (204) | No | Timeout, not meaningful to cache |
| SSE | No | Streaming response |
| HEAD | No | Metadata changes frequently |
| POST/PUT/DELETE | No | Write operations |
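
The table can be read as a small policy function. The sketch below is illustrative only: the `Operation` enum and `cache_control` helper are hypothetical names, the cacheable header values are the ones recommended on this page for catch-up reads, and reusing them for long-poll 200 responses (and sending `no-store` for every non-cacheable row) is an assumption rather than something the protocol mandates.

```python
from enum import Enum


class Operation(Enum):
    """Hypothetical labels for the rows of the caching table."""
    CATCH_UP = "catch-up"
    LONG_POLL_DATA = "long-poll-200"
    LONG_POLL_TIMEOUT = "long-poll-204"
    SSE = "sse"
    HEAD = "head"
    WRITE = "write"


# Rows marked "Yes" or "Conditionally" in the table.
CACHEABLE = {Operation.CATCH_UP, Operation.LONG_POLL_DATA}


def cache_control(op: Operation, public: bool = True) -> str:
    """Return a Cache-Control value for the given operation.

    Assumption: non-cacheable operations get "no-store"; cacheable ones
    reuse the catch-up values shown elsewhere on this page.
    """
    if op not in CACHEABLE:
        return "no-store"
    scope = "public" if public else "private"
    return f"{scope}, max-age=60, stale-while-revalidate=300"
```

For example, `cache_control(Operation.CATCH_UP, public=False)` yields the private-stream header, while `cache_control(Operation.HEAD)` yields `no-store`.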
Catch-Up Read Caching
Historical data is immutable and highly cacheable.
Server Headers
Servers SHOULD return appropriate Cache-Control headers:

    GET /stream?offset=abc123 HTTP/1.1

    HTTP/1.1 200 OK
    Content-Type: application/json
    Cache-Control: public, max-age=60, stale-while-revalidate=300
    ETag: "stream_123:abc123:def456"
    Stream-Next-Offset: def456

    [{"event": "data"}]

Cache-Control Recommendations
For public streams (no authentication):

    Cache-Control: public, max-age=60, stale-while-revalidate=300

For private streams (user-specific):

    Cache-Control: private, max-age=60, stale-while-revalidate=300

ETag Format
Servers SHOULD include ETags for cache validation:

    ETag: "{stream_id}:{start_offset}:{end_offset}"

Clients can use If-None-Match for conditional requests:

    GET /stream?offset=abc123 HTTP/1.1
    If-None-Match: "stream_123:abc123:def456"

    HTTP/1.1 304 Not Modified

Long-Poll Caching
Long-poll requests can be cached to enable request collapsing.
Request Collapsing
Multiple clients waiting for the same data can share a single upstream request:

    ┌─────────┐
    │ Client1 │──┐
    └─────────┘  │
                 │
    ┌─────────┐  │    ┌─────┐         ┌────────┐
    │ Client2 │──┼───▶│ CDN │────────▶│ Origin │
    └─────────┘  │    └─────┘         └────────┘
                 │       │
    ┌─────────┐  │       │  One request upstream
    │ Client3 │──┘       │  Response fanned out to all
    └─────────┘          ▼

Cursor Mechanism
The Stream-Cursor header enables collapsing:

    # First request
    GET /stream?offset=abc123&live=long-poll HTTP/1.1

    HTTP/1.1 200 OK
    Stream-Next-Offset: def456
    Stream-Cursor: c_12345

    [{"event": "new-data"}]

    # Subsequent request includes cursor
    GET /stream?offset=def456&live=long-poll&cursor=c_12345 HTTP/1.1

Clients at the same offset with the same cursor are candidates for collapsing.
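
To make the collapsing idea concrete, here is a minimal single-flight sketch of what an edge might do: concurrent requests with the same offset and cursor share one upstream fetch. The `Collapser` class and the `fetch` callback are hypothetical; real CDNs implement this internally and expose it only through configuration.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Optional, Tuple

Key = Tuple[str, Optional[str]]


class Collapser:
    """Share one in-flight upstream request per (offset, cursor) pair."""

    def __init__(self, fetch: Callable[[str, Optional[str]], Awaitable[Any]]):
        self._fetch = fetch  # hypothetical coroutine that contacts the origin
        self._inflight: Dict[Key, "asyncio.Future[Any]"] = {}

    async def get(self, offset: str, cursor: Optional[str] = None) -> Any:
        key = (offset, cursor)
        task = self._inflight.get(key)
        if task is None:
            # First waiter for this key: start the single upstream request.
            task = asyncio.ensure_future(self._fetch(offset, cursor))
            self._inflight[key] = task
            # Forget the entry once the response (or error) has arrived.
            task.add_done_callback(lambda _: self._inflight.pop(key, None))
        # shield() keeps one cancelled waiter from cancelling the shared fetch.
        return await asyncio.shield(task)
```

Three clients calling `get("def456", "c_12345")` at the same time would trigger one call to `fetch` and all receive the same response; a client at a different offset or cursor gets its own upstream request.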
Cache Keys
CDNs should key long-poll caches on:
- Stream URL
- Offset parameter
- Cursor parameter (if present)
- Authentication headers (for private streams)
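
As an illustration of the list above, a cache key could be derived roughly as follows. The function name, the `|` separator, and hashing the `Authorization` value (so raw credentials never appear in the key) are assumptions made for this sketch, not part of the protocol.

```python
import hashlib
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def long_poll_cache_key(url: str, authorization: Optional[str] = None) -> str:
    """Build a cache key from stream URL, offset, cursor, and optional auth."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    offset = query.get("offset", [""])[0]
    cursor = query.get("cursor", [""])[0]  # empty when no cursor is present
    components = [
        parts.netloc + parts.path,  # stream URL without the query string
        f"offset={offset}",
        f"cursor={cursor}",
    ]
    if authorization is not None:
        # Private streams: separate cache entries per credential.
        digest = hashlib.sha256(authorization.encode("utf-8")).hexdigest()
        components.append(f"auth={digest}")
    return "|".join(components)
```

Because the key is built from parsed parameters rather than the raw query string, two requests that differ only in parameter order map to the same entry.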
Timeout Responses
Servers SHOULD mark 204 No Content responses as uncacheable so that CDNs do not store them:

    HTTP/1.1 204 No Content
    Cache-Control: no-store
    Stream-Next-Offset: abc123

SSE Connection Cycling
SSE connections should be closed periodically to enable CDN edge caching.
Connection Lifecycle

    ┌────────┐                              ┌─────┐                    ┌────────┐
    │ Client │                              │ CDN │                    │ Origin │
    └───┬────┘                              └──┬──┘                    └───┬────┘
        │                                      │                           │
        │ GET ?offset=X&live=sse               │                           │
        │─────────────────────────────────────▶│                           │
        │                                      │ GET ?offset=X&live=sse    │
        │                                      │──────────────────────────▶│
        │                                      │                           │
        │                                      │ event: data ...           │
        │ event: data (from edge cache)        │◀──────────────────────────│
        │◀─────────────────────────────────────│                           │
        │                                      │                           │
        │ ... ~60 seconds ...                  │                           │
        │                                      │                           │
        │ Connection closed                    │ Connection closed         │
        │◀─────────────────────────────────────│◀──────────────────────────│
        │                                      │                           │
        │ GET ?offset=Y&live=sse               │                           │
        │─────────────────────────────────────▶│                           │
        │                                      │                           │
        │ Served from edge (collapsed)         │                           │
        │◀─────────────────────────────────────│                           │

Recommended Cycle Time
Servers SHOULD close SSE connections every ~60 seconds. This:
- Enables new clients to join existing edge requests
- Prevents connection state accumulation
- Allows CDN health checks to function
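
One way a server might enforce the cycle time is to wrap its event source so that the response ends once the connection is about 60 seconds old. The `cycle_limited` helper and its injectable `clock` parameter are hypothetical, and the sketch only checks the deadline when an event arrives; a real server would also need a timer to close connections that stay idle.

```python
import time
from typing import Callable, Iterable, Iterator


def cycle_limited(
    events: Iterable[str],
    max_age_seconds: float = 60.0,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[str]:
    """Yield events until the connection is older than max_age_seconds."""
    deadline = clock() + max_age_seconds
    for event in events:
        if clock() >= deadline:
            # Stop streaming; the client reconnects from its last offset.
            break
        yield event
```

After the generator stops, the client is expected to reconnect with the offset of the last event it received, which is what lets it join an existing edge request.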
HEAD Request Caching
Metadata requests should not be cached:

    HEAD /stream HTTP/1.1

    HTTP/1.1 200 OK
    Cache-Control: no-store
    Stream-Next-Offset: xyz789

The tail offset changes with every append, so caching would return stale data.
CDN Configuration
Cloudflare

    # Only the first matching Page Rule applies, so the more specific
    # live= rules are listed above the general catch-up rule.

    # Don't cache SSE
    Page Rule: /stream?*live=sse*
      Cache Level: Bypass

    # Don't cache long-poll timeouts
    Page Rule: /stream?*live=long-poll*
      Cache Level: Standard (respect origin headers)

    # Cache catch-up reads
    Page Rule: /stream?offset=*
      Cache Level: Cache Everything
      Edge Cache TTL: 60 seconds

AWS CloudFront

    CacheBehaviors:
      - PathPattern: "/stream*"
        QueryStringCacheKeys:
          - offset
          - cursor
        TTL:
          DefaultTTL: 60
          MaxTTL: 300

Fastly

    sub vcl_recv {
      # Never cache SSE streams
      if (req.url ~ "\?.*live=sse") {
        return(pass);
      }
      # Flag long-poll requests so vcl_fetch can handle 204 timeouts
      if (req.url ~ "\?.*live=long-poll") {
        set req.http.X-Cache-Long-Poll = "true";
      }
    }

    sub vcl_fetch {
      # Cache only 200 responses: long-poll timeouts (204) are passed uncached
      if (req.http.X-Cache-Long-Poll && beresp.status == 204) {
        set beresp.ttl = 0s;
        return(pass);
      }
    }

Cache Invalidation
The append-only nature of streams simplifies invalidation:
- No invalidation needed for catch-up reads (data is immutable)
- Natural expiration via TTL is sufficient
- New data is served from origin on first request
Purging
Stream deletion may require cache purging:

    # Purge all cached data for a stream (URL quoted so the shell does not expand *)
    curl -X PURGE "https://cdn.example.com/stream/*"

Security Considerations
Authentication with Caching
For authenticated streams, include credentials in cache keys:

    Cache key = URL + offset + Authorization header

Or use Cache-Control: private to prevent shared caching.
Cache Poisoning
Validate offset parameters before caching to prevent:
- Invalid offset injection
- Cross-stream cache pollution
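
A validation step of this kind might look like the sketch below. The offset format is not specified on this page, so the pattern used here (1 to 64 letters, digits, underscores, or hyphens) is purely an assumption, as is the `is_cacheable_request` name; substitute whatever format your server actually issues.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Assumed offset syntax; the protocol's real format may differ.
OFFSET_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def is_cacheable_request(url: str) -> bool:
    """Return True only if the URL carries exactly one well-formed offset."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    offsets = query.get("offset", [])
    # Reject missing or repeated offsets, which could make the edge and
    # the origin disagree about which value the request refers to.
    if len(offsets) != 1:
        return False
    return OFFSET_PATTERN.fullmatch(offsets[0]) is not None
```

Requests that fail the check can still be forwarded to the origin; they simply should not be stored or served from cache. Keeping the stream path in the cache key is what guards against cross-stream pollution.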
Monitoring
Track these metrics for cache optimization:
| Metric | Target | Action if Off |
|---|---|---|
| Cache hit ratio (catch-up) | >80% | Increase TTL |
| Long-poll collapse ratio | >50% | Verify cursor propagation |
| SSE reconnect rate | ~1/minute | Check connection cycling |
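
The first two rows of the table can be checked mechanically from raw counters. This is a small sketch assuming you already collect hit and request counts; the function and parameter names are hypothetical, and treating a ratio exactly at the target as "off" is a choice made here, not something the table specifies.

```python
from typing import List, Optional


def ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when there was no traffic."""
    if denominator == 0:
        return None
    return numerator / denominator


def suggested_actions(
    catchup_hits: int,
    catchup_requests: int,
    collapsed_long_polls: int,
    long_poll_requests: int,
) -> List[str]:
    """Map measured ratios to the actions listed in the monitoring table."""
    actions = []
    hit_ratio = ratio(catchup_hits, catchup_requests)
    if hit_ratio is not None and hit_ratio <= 0.80:
        actions.append("Increase TTL")
    collapse_ratio = ratio(collapsed_long_polls, long_poll_requests)
    if collapse_ratio is not None and collapse_ratio <= 0.50:
        actions.append("Verify cursor propagation")
    return actions
```

For instance, 70 hits out of 100 catch-up requests with a healthy collapse ratio would produce only the "Increase TTL" suggestion.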
Next Steps
- Reading Operations - All read modes
- Protocol Overview - Complete reference
- Quick Start - Hands-on examples