Offsets
Offsets are tokens that identify positions within a stream. They enable the core resumability feature of the protocol.
Overview
An offset represents a position in the stream after a specific byte. Clients use offsets to:
- Resume reading after disconnection
- Track progress through a stream
- Coordinate multiple readers

    Stream:   [ message 1 ][ message 2 ][ message 3 ]
    Offsets:  ^offset_0    ^offset_1    ^offset_2    ^offset_3 (tail)

Properties
Offsets have three critical properties that clients can rely on:
1. Opaque
Clients MUST NOT interpret offset structure or meaning.

    # These are all valid offsets from different implementations:
    0000000000000000_0000000000000042
    abc123def456
    chunk_007_byte_1024
    {"segment":3,"position":500}

The internal format is implementation-defined and may change without notice.
2. Lexicographically Sortable
For any two offsets from the same stream, comparing them as strings determines their order:

    // JavaScript example
    const offset1 = "0000000000000000_0000000000000042"
    const offset2 = "0000000000000000_0000000000000100"

    offset1 < offset2 // true - offset1 is earlier in the stream

This property enables clients to:
- Determine if they’ve seen data before
- Merge data from multiple reads
- Implement exactly-once processing
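
As a rough sketch of how ordering supports deduplication, the snippet below merges two overlapping reads and drops records whose offsets were already seen. The `(offset, payload)` record shape, the short sample offsets, and the `merge_reads` helper are assumptions made for illustration; the protocol itself does not define them.

```python
def merge_reads(first, second):
    """Merge two lists of (offset, payload) pairs read from the same stream.

    Offsets are only compared as strings, never parsed.
    """
    merged = []
    last_offset = None
    # Sorting by the offset string restores stream order.
    for offset, payload in sorted(first + second, key=lambda record: record[0]):
        # Skip anything at or before the last offset already emitted.
        if last_offset is not None and offset <= last_offset:
            continue
        merged.append((offset, payload))
        last_offset = offset
    return merged


# Two reads that overlap on offset "0002" (hypothetical sample data).
read_a = [("0001", "msg1"), ("0002", "msg2")]
read_b = [("0002", "msg2"), ("0003", "msg3")]
merged = merge_reads(read_a, read_b)
```

Because the comparison is purely on strings, the same helper works with any offset format that satisfies the ordering property.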
3. Persistent
Offsets remain valid for the lifetime of the stream:
- Valid until stream is deleted
- Valid until stream expires
- Valid even if data before the offset is dropped (retention)
Using Offsets
Initial Read
To read from the beginning, use offset -1 or omit the parameter:

    GET /stream?offset=-1
    # or
    GET /stream

Subsequent Reads
Use the Stream-Next-Offset header from the previous response:

    # First read
    GET /stream?offset=-1
    # Response includes: Stream-Next-Offset: abc123

    # Second read
    GET /stream?offset=abc123
    # Response includes: Stream-Next-Offset: def456

    # Third read
    GET /stream?offset=def456

Persistence
Clients SHOULD persist offsets to enable resumability:

    // Browser - localStorage
    localStorage.setItem(
      "stream_offset",
      response.headers.get("Stream-Next-Offset")
    )

    // Later, resume from saved offset
    const offset = localStorage.getItem("stream_offset") || "-1"
    fetch(`/stream?offset=${encodeURIComponent(offset)}`)

    # Python - file storage
    def save_offset(offset):
        with open('.stream_offset', 'w') as f:
            f.write(offset)

    def load_offset():
        try:
            with open('.stream_offset', 'r') as f:
                return f.read().strip()
        except FileNotFoundError:
            return '-1'

Special Values
| Value | Meaning | Usage |
|---|---|---|
| -1 | Stream start | Read from beginning |
| Omitted | Same as -1 | Default behavior |
| {token} | After position | Resume from here |
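
Taken together, the initial value, the Stream-Next-Offset header, and persistence suggest a simple resume loop. The sketch below assumes a hypothetical `fetch(offset)` callable that returns `(data, next_offset)`, where `next_offset` stands for the Stream-Next-Offset value. The in-memory `store` dict stands in for whatever durable storage the client uses, and the fake stream tokens are illustrative only.

```python
START = "-1"  # special value from the table above: read from the beginning


def read_batches(fetch, store, max_reads):
    """Read up to max_reads batches, resuming from the offset saved in store.

    fetch is a hypothetical callable: fetch(offset) -> (data, next_offset).
    """
    offset = store.get("stream_offset", START)
    batches = []
    for _ in range(max_reads):
        data, next_offset = fetch(offset)
        batches.append(data)
        # Persist only after the batch is handled, so a crash replays
        # the batch instead of skipping it.
        store["stream_offset"] = next_offset
        offset = next_offset
    return batches


# A fake server for illustration: maps each offset to (data, next offset).
FAKE_STREAM = {
    "-1": ("msg1", "abc123"),
    "abc123": ("msg2", "def456"),
    "def456": ("msg3", "ghi789"),
}

store = {}
first = read_batches(FAKE_STREAM.__getitem__, store, max_reads=2)
# Simulate a restart: a new call resumes from the persisted offset.
second = read_batches(FAKE_STREAM.__getitem__, store, max_reads=1)
```

Saving the offset after handling each batch favors at-least-once delivery; a client that saves first would risk skipping data on a crash.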
URL Encoding
Offsets may contain characters that require URL encoding. Clients MUST properly encode offset values:

    // JavaScript
    const offset = "abc/123+456"
    const url = `/stream?offset=${encodeURIComponent(offset)}`
    // Result: /stream?offset=abc%2F123%2B456

Character Restrictions
Servers SHOULD use offsets that avoid these characters:
- , (comma)
- & (ampersand)
- = (equals)
- ? (question mark)
This prevents conflicts with URL query parameter syntax.
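
For Python clients, a minimal sketch of the same encoding step: `urllib.parse.quote` with `safe=""` percent-encodes every reserved character, including the four listed above. The `build_read_url` helper and its default `/stream` path are illustrative names, not part of the protocol.

```python
from urllib.parse import quote


def build_read_url(offset=None, path="/stream"):
    """Build a read URL; omitting the offset reads from the beginning."""
    if offset is None:
        return path
    # safe="" makes quote() encode "/" too, which it leaves alone by default.
    return f"{path}?offset={quote(offset, safe='')}"


url_plain = build_read_url()
url_start = build_read_url("-1")
url_encoded = build_read_url("abc/123+456")
url_reserved = build_read_url("a,b&c=d?e")
```

Here `url_encoded` matches the JavaScript result shown earlier, `/stream?offset=abc%2F123%2B456`.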
Offset Comparison
Clients MAY compare offsets to determine ordering:

    def is_newer(offset_a, offset_b):
        """Returns True if offset_a is after offset_b in the stream."""
        return offset_a > offset_b  # Lexicographic comparison

    # Example usage
    current = "0000000000000000_0000000000000100"
    received = "0000000000000000_0000000000000150"

    if is_newer(received, current):
        process_data()
        current = received

Retention and Gone Offsets
When a stream has retention policies, old data may be dropped:

    Stream with 1-hour retention:

    Hour 0: [msg1][msg2][msg3][msg4][msg5]
            ^off1 ^off2 ^off3 ^off4 ^off5

    Hour 2: [msg4][msg5][msg6]
            ^off4 ^off5 ^off6

    Reading with off1 returns: 410 Gone

When reading a dropped offset:

    GET /stream?offset=off1

    HTTP/1.1 410 Gone
    Content-Type: application/json

    {"error": "Offset before retention window", "earliest_offset": "off4"}

Clients should handle 410 Gone by:
- Using the earliest available offset (if provided)
- Starting from the current tail
- Alerting the user about data loss
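
These options can be combined into a small recovery routine. In the sketch below, the `recover_from_gone` name, the `fallback_offset` parameter, and the use of a log warning as the alert are assumptions. Only the 410 status and the `earliest_offset` field come from the example response above, and how a client obtains an offset for the current tail is left to the caller.

```python
import json
import logging

log = logging.getLogger("stream_client")


def recover_from_gone(status, body, fallback_offset):
    """Return the offset to retry with after a 410 Gone, or None otherwise.

    fallback_offset is whatever the caller chooses when the server gives
    no earliest_offset (for example, an offset for the current tail).
    """
    if status != 410:
        return None
    try:
        info = json.loads(body)
    except json.JSONDecodeError:
        info = {}
    if not isinstance(info, dict):
        info = {}
    next_offset = info.get("earliest_offset") or fallback_offset
    # Data between the requested offset and next_offset is lost; surface it.
    log.warning("offset expired; resuming from %s", next_offset)
    return next_offset


body = '{"error": "Offset before retention window", "earliest_offset": "off4"}'
with_hint = recover_from_gone(410, body, fallback_offset="off6")
without_hint = recover_from_gone(410, "{}", fallback_offset="off6")
not_gone = recover_from_gone(200, "", fallback_offset="off6")
```

Preferring `earliest_offset` keeps the gap as small as possible, while the fallback covers servers that omit the field.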
Implementation Guidelines
For Clients
- Never parse offsets - Treat them as opaque strings
- Always persist - Save offsets for resumability
- Handle 410 Gone - Gracefully recover from retention gaps
- URL encode - Properly encode when using in URLs
For Servers
- Lexicographic ordering - Ensure string comparison reflects stream order
- URL-safe characters - Prefer characters that don’t require encoding
- Consistent format - Don’t change offset format unexpectedly
- Include in responses - Always return Stream-Next-Offset
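
To illustrate the lexicographic ordering guideline, here is a server-side sketch that produces offsets in a zero-padded `{seq}_{byte}` layout. The `make_offset` helper and the fixed width of 16 digits are assumptions modeled on the examples in this document; a real server may use any format that sorts correctly as a string.

```python
WIDTH = 16  # assumed fixed width, modeled on this document's examples


def make_offset(seq, byte_pos):
    """Format an offset as zero-padded {seq}_{byte}."""
    if not (0 <= seq < 10**WIDTH and 0 <= byte_pos < 10**WIDTH):
        raise ValueError("value does not fit in the fixed width")
    return f"{seq:0{WIDTH}d}_{byte_pos:0{WIDTH}d}"


# Padded offsets keep numeric order under string comparison.
padded = [make_offset(0, n) for n in (9, 42, 100)]
padded_is_sorted = sorted(padded) == padded

# Without padding, "100" sorts before "42" and "9", breaking the guarantee.
unpadded = [str(n) for n in (9, 42, 100)]
unpadded_is_sorted = sorted(unpadded) == unpadded
```

Fixed-width padding is the simplest way to make numeric positions sort correctly as strings, at the cost of a hard upper bound on each field.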
Example Offset Formats
Different implementations use different formats. All are valid:
| Implementation | Format | Example |
|---|---|---|
| Byte position | {seq}_{byte} | 0000000000000000_0000000000000042 |
| Timestamp | {timestamp} | 1703001234567 |
| Logical | {segment}:{offset} | seg_007:1024 |
| UUID-based | {uuid} | 550e8400-e29b-41d4-a716-446655440000 |
| Encoded | {base64} | eyJzIjozLCJwIjo1MDB9 |
Next Steps
- Reading Operations - How reads use offsets
- Writing Operations - Offset updates on append
- Caching - Caching based on offsets