
Caching and CDN Integration

The Unbroken Protocol is designed to work efficiently with CDNs, proxies, and HTTP caches. This page documents caching strategies and CDN integration patterns.

Caching Model

Different operations have different caching characteristics:

| Operation | Cacheable | Reason |
|---|---|---|
| Catch-up reads | Yes | Historical data is immutable |
| Long-poll (200) | Conditionally | New data at specific offset |
| Long-poll (204) | No | Timeout, not meaningful to cache |
| SSE | No | Streaming response |
| HEAD | No | Metadata changes frequently |
| POST/PUT/DELETE | No | Write operations |
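The table above can be read as a small decision function. The sketch below is illustrative only: the function name `is_cacheable` and its parameters are assumptions, not part of the protocol, and it treats "conditionally" as "only when the status is 200".

```python
from typing import Optional


def is_cacheable(method: str, live: Optional[str], status: int) -> bool:
    """Return True if a shared cache may store the response (hypothetical helper).

    `live` is the value of the `live` query parameter, or None for a catch-up read.
    """
    if method.upper() != "GET":
        return False          # HEAD, POST, PUT and DELETE are not cached
    if live == "sse":
        return False          # streaming responses are not cacheable
    if live == "long-poll":
        return status == 200  # 204 timeouts must not be cached
    return status == 200      # catch-up read of immutable history
```

For example, `is_cacheable("GET", "long-poll", 204)` is `False`, while the same request answered with a 200 is eligible for caching.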

Catch-Up Read Caching

Historical data is immutable and highly cacheable.

Server Headers

Servers SHOULD return appropriate Cache-Control headers:

GET /stream?offset=abc123 HTTP/1.1

HTTP/1.1 200 OK
Content-Type: application/json
Cache-Control: public, max-age=60, stale-while-revalidate=300
ETag: "stream_123:abc123:def456"
Stream-Next-Offset: def456

[{"event": "data"}]

Cache-Control Recommendations

For public streams (no authentication):

Cache-Control: public, max-age=60, stale-while-revalidate=300

For private streams (user-specific):

Cache-Control: private, max-age=60, stale-while-revalidate=300
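A server might pick between these two header values with a helper like the one below. This is a minimal sketch: the function name and the `is_private` flag are assumptions, and the defaults simply mirror the values recommended above.

```python
def cache_control(is_private: bool, max_age: int = 60, stale_while_revalidate: int = 300) -> str:
    """Build a Cache-Control value for a catch-up read (hypothetical helper)."""
    # "private" keeps user-specific responses out of shared caches such as CDNs.
    scope = "private" if is_private else "public"
    return f"{scope}, max-age={max_age}, stale-while-revalidate={stale_while_revalidate}"
```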

ETag Format

Servers SHOULD include ETags for cache validation:

ETag: "{stream_id}:{start_offset}:{end_offset}"

Clients can use If-None-Match for conditional requests:

GET /stream?offset=abc123 HTTP/1.1
If-None-Match: "stream_123:abc123:def456"

HTTP/1.1 304 Not Modified
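The ETag format and the conditional request above could be handled on the server roughly as follows. The function names are assumptions, and the comma split is a simplification that assumes stream IDs and offsets never contain commas.

```python
from typing import Optional


def make_etag(stream_id: str, start_offset: str, end_offset: str) -> str:
    """Format an ETag as "{stream_id}:{start_offset}:{end_offset}", quotes included."""
    return f'"{stream_id}:{start_offset}:{end_offset}"'


def _strip_weak(tag: str) -> str:
    # If-None-Match uses weak comparison, so a leading W/ is ignored.
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def response_status(if_none_match: Optional[str], current_etag: str) -> int:
    """Return 304 when the client's validator matches the current ETag, else 200."""
    if not if_none_match:
        return 200
    candidates = [_strip_weak(t) for t in if_none_match.split(",")]
    if "*" in candidates or _strip_weak(current_etag) in candidates:
        return 304
    return 200
```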

Long-Poll Caching

Long-poll requests can be cached to enable request collapsing.

Request Collapsing

Multiple clients waiting for the same data can share a single upstream request:

┌─────────┐
│ Client1 │──┐
└─────────┘  │
┌─────────┐  │    ┌─────┐         ┌────────┐
│ Client2 │──┼───▶│ CDN │────────▶│ Origin │
└─────────┘  │    └─────┘         └────────┘
             │       │
┌─────────┐  │       │  One request upstream
│ Client3 │──┘       │  Response fanned out to all
└─────────┘          ▼

Cursor Mechanism

The Stream-Cursor header enables collapsing:

# First request
GET /stream?offset=abc123&live=long-poll HTTP/1.1

HTTP/1.1 200 OK
Stream-Next-Offset: def456
Stream-Cursor: c_12345

[{"event": "new-data"}]

# Subsequent request includes cursor
GET /stream?offset=def456&live=long-poll&cursor=c_12345 HTTP/1.1

Clients at the same offset with the same cursor are candidates for collapsing.
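On the client side, carrying the cursor forward amounts to building the next URL from the previous response's headers. The sketch below assumes the headers are available as a plain dict with the exact header casing shown above; the function name is hypothetical.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def next_long_poll_url(stream_url: str, response_headers: dict) -> str:
    """Build the follow-up long-poll URL from the previous response's headers."""
    params = {
        "offset": response_headers["Stream-Next-Offset"],
        "live": "long-poll",
    }
    cursor = response_headers.get("Stream-Cursor")
    if cursor:
        # Echo the cursor back so the CDN can collapse clients at the same offset.
        params["cursor"] = cursor
    parts = urlsplit(stream_url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(params), ""))
```

Because every client derives the URL from the same two headers, clients that received the same response issue byte-identical follow-up requests, which is what makes them collapsible.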

Cache Keys

CDNs should key long-poll caches on:

  1. Stream URL
  2. Offset parameter
  3. Cursor parameter (if present)
  4. Authentication headers (for private streams)
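The four components above could be combined into a single key as sketched below. The key layout, the `|` separator, and hashing the Authorization value (so raw credentials never appear in the key) are assumptions for illustration; real CDNs expose their own cache-key configuration.

```python
import hashlib
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def long_poll_cache_key(url: str, authorization: Optional[str] = None) -> str:
    """Derive a cache key from stream URL, offset, cursor and (optionally) credentials."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    offset = query.get("offset", [""])[0]
    cursor = query.get("cursor", [""])[0]
    # Hash the credentials so the key separates users without embedding secrets.
    auth = hashlib.sha256(authorization.encode()).hexdigest() if authorization else ""
    return "|".join([parts.netloc + parts.path, offset, cursor, auth])
```

Since only the named parameters contribute, requests that differ in parameter order still map to the same key.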

Timeout Responses

Servers SHOULD mark 204 No Content responses as non-cacheable so that caches do not store timeouts:

HTTP/1.1 204 No Content
Cache-Control: no-store
Stream-Next-Offset: abc123

SSE Connection Cycling

SSE connections should be closed periodically to enable CDN edge caching.

Connection Lifecycle

┌────────┐                              ┌─────┐                    ┌────────┐
│ Client │                              │ CDN │                    │ Origin │
└───┬────┘                              └──┬──┘                    └───┬────┘
    │                                      │                           │
    │ GET ?offset=X&live=sse               │                           │
    │─────────────────────────────────────▶│                           │
    │                                      │ GET ?offset=X&live=sse    │
    │                                      │──────────────────────────▶│
    │                                      │                           │
    │                                      │ event: data ...           │
    │ event: data (from edge cache)        │◀──────────────────────────│
    │◀─────────────────────────────────────│                           │
    │                                      │                           │
    │ ... ~60 seconds ...                  │                           │
    │                                      │                           │
    │ Connection closed                    │ Connection closed         │
    │◀─────────────────────────────────────│◀──────────────────────────│
    │                                      │                           │
    │ GET ?offset=Y&live=sse               │                           │
    │─────────────────────────────────────▶│                           │
    │                                      │                           │
    │ Served from edge (collapsed)         │                           │
    │◀─────────────────────────────────────│                           │

Servers SHOULD close SSE connections every ~60 seconds. This:

  • Enables new clients to join existing edge requests
  • Prevents connection state accumulation
  • Allows CDN health checks to function
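One way a server could enforce the ~60 second lifetime is to wrap its event source in a generator that stops once a deadline passes. This is a simplified sketch: the function name, the injectable `clock`, and the exact frame layout are assumptions, and it only checks the deadline when an event arrives, so a real server would also need a timer to close idle connections.

```python
import time
from typing import Callable, Iterable, Iterator


def sse_frames(
    events: Iterable[str],
    max_lifetime: float = 60.0,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[str]:
    """Yield SSE frames until the connection has been open for max_lifetime seconds."""
    deadline = clock() + max_lifetime
    for data in events:
        if clock() >= deadline:
            # End the response; the client reconnects from its last offset.
            break
        # Assumes single-line payloads; multi-line data needs one "data:" line per line.
        yield f"event: data\ndata: {data}\n\n"
```

Passing a fake `clock` makes the cut-off easy to exercise in tests without waiting a minute.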

HEAD Request Caching

Metadata requests should not be cached:

HEAD /stream HTTP/1.1

HTTP/1.1 200 OK
Cache-Control: no-store
Stream-Next-Offset: xyz789

The tail offset changes with every append, so caching would return stale data.

CDN Configuration

Cloudflare

# Only the highest-priority matching Page Rule applies to a request,
# so list the most specific patterns first.
# Don't cache SSE
Page Rule: /stream?*live=sse*
Cache Level: Bypass
# Don't cache long-poll timeouts
Page Rule: /stream?*live=long-poll*
Cache Level: Standard (respect origin headers)
# Cache catch-up reads
Page Rule: /stream?offset=*
Cache Level: Cache Everything
Edge Cache TTL: 60 seconds

AWS CloudFront

CacheBehaviors:
  - PathPattern: "/stream*"
    QueryStringCacheKeys:
      - offset
      - cursor
    TTL:
      DefaultTTL: 60
      MaxTTL: 300

Fastly

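sub vcl_recv {
  # SSE responses are streamed, so never cache them
  if (req.url ~ "\?.*live=sse") {
    return(pass);
  }
  if (req.url ~ "\?.*live=long-poll") {
    # Mark long-poll requests so vcl_fetch can cache only 200 responses
    set req.http.X-Cache-Long-Poll = "true";
  }
}
# Fastly VCL processes origin responses in vcl_fetch
# (vcl_backend_response and beresp.uncacheable are Varnish 4+ names)
sub vcl_fetch {
  if (req.http.X-Cache-Long-Poll && beresp.status == 204) {
    # Long-poll timeout: do not store
    set beresp.ttl = 0s;
    set beresp.cacheable = false;
    return(pass);
  }
}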

Cache Invalidation

The append-only nature of streams simplifies invalidation:

  • No invalidation needed for catch-up reads (data is immutable)
  • Natural expiration via TTL is sufficient
  • New data is served from origin on first request

Purging

Stream deletion may require cache purging:

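# Purge all cached data for a stream
# (quote the URL so the shell does not expand the wildcard;
# wildcard purge support varies by CDN)
curl -X PURGE "https://cdn.example.com/stream/*"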

Security Considerations

Authentication with Caching

For authenticated streams, include credentials in cache keys:

Cache key = URL + offset + Authorization header

Or use Cache-Control: private to prevent shared caching.

Cache Poisoning

Validate offset parameters before caching to prevent:

  • Invalid offset injection
  • Cross-stream cache pollution
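A cache-side check along these lines might look like the sketch below. The offset pattern is an assumption (this page does not define the offset grammar; substitute the real one), and the cross-stream check relies on the ETag format described earlier. Both function names are hypothetical.

```python
import re

# Assumed shape of an offset token: short and URL-safe. Replace with the real grammar.
OFFSET_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def is_valid_offset(offset: str) -> bool:
    """Reject empty, overlong, or non-token offsets before they reach the cache key."""
    return OFFSET_PATTERN.fullmatch(offset) is not None


def safe_to_cache(stream_id: str, offset: str, etag: str) -> bool:
    """Cache only if the offset is well-formed and the ETag names the requested stream and offset."""
    if not is_valid_offset(offset):
        return False
    parts = etag.strip('"').rsplit(":", 2)  # "{stream_id}:{start_offset}:{end_offset}"
    if len(parts) != 3:
        return False
    etag_stream, etag_start, _etag_end = parts
    return etag_stream == stream_id and etag_start == offset
```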

Monitoring

Track these metrics for cache optimization:

| Metric | Target | Action if off target |
|---|---|---|
| Cache hit ratio (catch-up) | >80% | Increase TTL |
| Long-poll collapse ratio | >50% | Verify cursor propagation |
| SSE reconnect rate | ~1/minute | Check connection cycling |
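The targets above can be turned into a simple check over collected metrics. In this sketch the function name is hypothetical, the ratios are expected as fractions between 0 and 1, and the tolerance band used for "~1/minute" (0.5 to 2.0 reconnects per minute per client) is an assumption rather than something the protocol specifies.

```python
from typing import List


def cache_actions(hit_ratio: float, collapse_ratio: float, sse_reconnects_per_min: float) -> List[str]:
    """Return the suggested actions for every metric that misses its target."""
    actions = []
    if hit_ratio <= 0.80:
        actions.append("Increase TTL")
    if collapse_ratio <= 0.50:
        actions.append("Verify cursor propagation")
    # Assumed tolerance band around the ~1/minute target.
    if not 0.5 <= sse_reconnects_per_min <= 2.0:
        actions.append("Check connection cycling")
    return actions
```

For instance, a 65% catch-up hit ratio with otherwise healthy metrics yields only `["Increase TTL"]`.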
