
Offsets

Offsets are tokens that identify positions within a stream. They enable the core resumability feature of the protocol.

Overview

An offset represents a position in the stream after a specific byte. Clients use offsets to:

  1. Resume reading after disconnection
  2. Track progress through a stream
  3. Coordinate multiple readers

Stream:  [ message 1 ][ message 2 ][ message 3 ]
Offsets: ^offset_0    ^offset_1    ^offset_2    ^offset_3 (tail)

Properties

Offsets have three critical properties that clients can rely on:

1. Opaque

Clients MUST NOT interpret offset structure or meaning.

# These are all valid offsets from different implementations:
0000000000000000_0000000000000042
abc123def456
chunk_007_byte_1024
{"segment":3,"position":500}

The internal format is implementation-defined and may change without notice.

2. Lexicographically Sortable

For any two offsets from the same stream, comparing them as strings determines their order:

// JavaScript example
const offset1 = "0000000000000000_0000000000000042"
const offset2 = "0000000000000000_0000000000000100"
offset1 < offset2 // true - offset1 is earlier in the stream

This property enables clients to:

  • Determine if they’ve seen data before
  • Merge data from multiple reads
  • Implement exactly-once processing
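
As a rough illustration of how string ordering supports deduplication and merging, the sketch below combines chunks from two overlapping reads. The `(offset, data)` tuple shape and the `merge_reads` name are assumptions made for this example and are not defined by the protocol.

```python
def merge_reads(*reads):
    """Merge chunks from several reads, dropping duplicates.

    Each read is a list of (offset, data) tuples, where `offset` is the
    position after the chunk. This shape is an assumption of the sketch.
    """
    seen = {}
    for read in reads:
        for offset, data in read:
            # Offsets are only used as keys and compared, never parsed.
            seen.setdefault(offset, data)
    # Plain string sorting gives stream order for offsets of one stream.
    return [seen[offset] for offset in sorted(seen)]


read_a = [("0000000000000000_0000000000000010", b"msg1"),
          ("0000000000000000_0000000000000020", b"msg2")]
read_b = [("0000000000000000_0000000000000020", b"msg2"),
          ("0000000000000000_0000000000000030", b"msg3")]
merged = merge_reads(read_a, read_b)
```

Here `merged` holds each message once, in stream order, even though `msg2` arrived in both reads.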

3. Persistent

Offsets remain valid for the lifetime of the stream:

  • Valid until stream is deleted
  • Valid until stream expires
  • Valid even if data before the offset is dropped (retention)

Using Offsets

Initial Read

To read from the beginning, use offset -1 or omit the parameter:

GET /stream?offset=-1
# or
GET /stream

Subsequent Reads

Use the Stream-Next-Offset header from the previous response:

# First read
GET /stream?offset=-1
# Response includes: Stream-Next-Offset: abc123
# Second read
GET /stream?offset=abc123
# Response includes: Stream-Next-Offset: def456
# Third read
GET /stream?offset=def456
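
The request chain above can be sketched as a small loop. This is a minimal sketch, not a reference client: `fetch` is a hypothetical callable standing in for the HTTP request, and stopping when the returned offset is unchanged is an assumption of this example rather than documented protocol behavior.

```python
def read_stream(fetch, start_offset="-1", max_reads=10):
    """Follow Stream-Next-Offset from one read to the next.

    `fetch(offset)` is a hypothetical callable that performs
    GET /stream?offset=<offset> and returns (body, next_offset).
    """
    offset = start_offset
    chunks = []
    for _ in range(max_reads):
        body, next_offset = fetch(offset)
        if body:
            chunks.append(body)
        if next_offset == offset:
            # Assumption: an unchanged offset means there is nothing new.
            break
        offset = next_offset
    return chunks, offset


def fake_fetch(offset):
    """Canned responses that mirror the example requests above."""
    responses = {
        "-1": (b"first", "abc123"),
        "abc123": (b"second", "def456"),
        "def456": (b"", "def456"),
    }
    return responses[offset]


chunks, last_offset = read_stream(fake_fetch)
```

The offset returned at the end is the one to persist, so that a later session can pick up where this one stopped.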

Persistence

Clients SHOULD persist offsets to enable resumability:

// Browser - localStorage
const nextOffset = response.headers.get("Stream-Next-Offset")
if (nextOffset !== null) {
  // Skip the write if the header is missing, so "null" is never stored
  localStorage.setItem("stream_offset", nextOffset)
}

// Later, resume from the saved offset
const offset = localStorage.getItem("stream_offset") || "-1"
fetch(`/stream?offset=${encodeURIComponent(offset)}`)

# Python - file storage
def save_offset(offset):
    with open('.stream_offset', 'w') as f:
        f.write(offset)

def load_offset():
    try:
        with open('.stream_offset', 'r') as f:
            # Fall back to the stream start if the file is empty
            return f.read().strip() or '-1'
    except FileNotFoundError:
        return '-1'
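
A crash while the offset file is being written could leave a partial value on disk. One common way to reduce that risk is to write to a temporary file and rename it into place. The sketch below shows the idea; `save_offset_atomic` is a name made up for this example, and the approach assumes the temporary file and the target live on the same filesystem.

```python
import os
import tempfile


def save_offset_atomic(offset, path=".stream_offset"):
    """Write the offset to a temporary file, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".stream_offset.")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(offset)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        # os.replace swaps the new file in as a single step.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)  # clean up the temporary file on failure
        raise
```

A reader of the file then sees either the old offset or the new one, never a half-written value.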

Special Values

| Value   | Meaning        | Usage               |
|---------|----------------|---------------------|
| -1      | Stream start   | Read from beginning |
| Omitted | Same as -1     | Default behavior    |
| {token} | After position | Resume from here    |

URL Encoding

Offsets may contain characters that require URL encoding. Clients MUST properly encode offset values:

// JavaScript
const offset = "abc/123+456"
const url = `/stream?offset=${encodeURIComponent(offset)}`
// Result: /stream?offset=abc%2F123%2B456
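
For Python clients, the same encoding can be done with the standard library's `urllib.parse.quote`. This is a small sketch using the same sample offset as the JavaScript example above.

```python
from urllib.parse import quote

offset = "abc/123+456"
# safe="" makes quote() escape "/" too, which it leaves alone by default.
url = f"/stream?offset={quote(offset, safe='')}"
# Result: /stream?offset=abc%2F123%2B456
```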

Character Restrictions

Servers SHOULD use offsets that avoid these characters:

  • , (comma)
  • & (ampersand)
  • = (equals)
  • ? (question mark)

This prevents conflicts with URL query parameter syntax.
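
A server implementation could check its own offsets against this list before returning them. The helper below is a hypothetical sketch of such a check; its name and the idea of running it in a test suite are assumptions, not protocol requirements.

```python
# The four characters listed above.
RESTRICTED_CHARS = frozenset(",&=?")


def uses_restricted_chars(offset):
    """Return True if the offset contains a character servers should avoid."""
    return any(ch in RESTRICTED_CHARS for ch in offset)
```

For example, `uses_restricted_chars("chunk_007_byte_1024")` is false, while an offset such as `"abc=1"` would be flagged.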

Offset Comparison

Clients MAY compare offsets to determine ordering:

def is_newer(offset_a, offset_b):
    """Returns True if offset_a is after offset_b in the stream."""
    return offset_a > offset_b  # Lexicographic comparison

# Example usage
current = "0000000000000000_0000000000000100"
received = "0000000000000000_0000000000000150"
if is_newer(received, current):
    process_data()
    current = received

Retention and Gone Offsets

When a stream has retention policies, old data may be dropped:

Stream with 1-hour retention:

Hour 0: [msg1][msg2][msg3][msg4][msg5]
        ^off1 ^off2 ^off3 ^off4 ^off5

Hour 2: [msg4][msg5][msg6]
        ^off4 ^off5 ^off6

Reading with off1 returns: 410 Gone

When reading a dropped offset:

GET /stream?offset=off1

HTTP/1.1 410 Gone
Content-Type: application/json

{"error": "Offset before retention window", "earliest_offset": "off4"}

Clients should handle 410 Gone by doing one or more of the following:

  1. Using the earliest available offset (if provided)
  2. Starting from the current tail
  3. Alerting the user about data loss
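
The recovery options above could be combined roughly as follows. This is a sketch under stated assumptions: the `earliest_offset` field comes from the example response above, while `recover_from_gone` and the caller-supplied `fallback_offset` are hypothetical names introduced here. How a client obtains an offset for the current tail is not covered in this section.

```python
import json


def recover_from_gone(status, body, fallback_offset):
    """Pick the offset to retry with after a 410 Gone response.

    Prefers `earliest_offset` from the JSON error body when present,
    otherwise uses the caller-supplied fallback (hypothetical parameter).
    """
    if status != 410:
        raise ValueError("expected a 410 Gone response")
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        payload = {}  # missing or malformed body
    if not isinstance(payload, dict):
        payload = {}
    return payload.get("earliest_offset") or fallback_offset


body = '{"error": "Offset before retention window", "earliest_offset": "off4"}'
next_offset = recover_from_gone(410, body, fallback_offset="-1")
```

Whichever offset is chosen, the data between the old offset and the new one has been lost, so the caller should still report the gap.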

Implementation Guidelines

For Clients

  1. Never parse offsets - Treat them as opaque strings
  2. Always persist - Save offsets for resumability
  3. Handle 410 Gone - Gracefully recover from retention gaps
  4. URL encode - Properly encode when using in URLs

For Servers

  1. Lexicographic ordering - Ensure string comparison reflects stream order
  2. URL-safe characters - Prefer characters that don’t require encoding
  3. Consistent format - Don’t change offset format unexpectedly
  4. Include in responses - Always return Stream-Next-Offset
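
One possible way for a server to satisfy the lexicographic ordering guideline is to zero-pad numeric components to a fixed width, as in the `{seq}_{byte}` style used in this document's examples. The function below is an illustrative sketch only; `format_offset` and the 16-digit width are assumptions chosen to match those examples.

```python
def format_offset(seq, byte_pos, width=16):
    """Build a fixed-width offset so string order matches numeric order."""
    for value in (seq, byte_pos):
        if value < 0 or len(str(value)) > width:
            raise ValueError("component does not fit the fixed width")
    return f"{seq:0{width}d}_{byte_pos:0{width}d}"


earlier = format_offset(0, 42)
later = format_offset(0, 100)
```

Without padding, the strings `"100"` and `"42"` would sort in the wrong order, because `"1"` comes before `"4"`. With padding, `earlier < later` holds as a plain string comparison.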

Example Offset Formats

Different implementations use different formats. All are valid:

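| Implementation | Format             | Example                              |
|----------------|--------------------|--------------------------------------|
| Byte position  | {seq}_{byte}       | 0000000000000000_0000000000000042    |
| Timestamp      | {timestamp}        | 1703001234567                        |
| Logical        | {segment}:{offset} | seg_007:1024                         |
| UUID-based     | {uuid}             | 550e8400-e29b-41d4-a716-446655440000 |
| Encoded        | {base64}           | eyJzIjozLCJwIjo1MDB9                 |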
