Row ID and Lineage Specification¶
Overview¶
Lance provides row identification and lineage tracking capabilities. Row addressing enables efficient random access to rows within the table through a physical location encoding. Stable row IDs provide persistent identifiers that remain constant throughout a row's lifetime, even as its physical location changes. Row version tracking records when rows were created and last modified, enabling incremental processing, change data capture, and time-travel queries.
Row ID Styles¶
Lance uses two different styles of row IDs:
Row Address¶
Row address is the physical location of a row in the table, represented as a 64-bit identifier composed of two 32-bit values:
This addressing scheme enables efficient random access: given a row address, the fragment and offset are extracted with bit operations. Row addresses change when data is reorganized through compaction or updates.
Row address is currently the primary form of identifier used for indexing purposes. Secondary indices (vector indices, scalar indices, full-text search indices) reference rows by their row addresses.
Note
Work to support stable row IDs in indices is in progress.
Stable Row ID¶
Stable Row ID is a unique auto-incrementing u64 identifier assigned to each row that remains constant throughout the row's lifetime, even when the row's physical location (row address) changes. See the next section for more details.
Warning
Historically, "row ID" was used to mean row address interchangeably. With the introduction of stable row IDs, there could be places in code and documentation that mix the terms "row ID" and "row address" or "row ID" and "stable row ID". Please raise a PR if you find any place incorrect or confusing.
Stable Row ID¶
Row ID Assignment¶
Row IDs are assigned using a monotonically increasing next_row_id counter stored in the manifest.
Assignment Protocol:
- Writer reads the current
next_row_idfrom the manifest at the read version - Writer assigns row IDs sequentially starting from
next_row_idfor new rows - Writer updates
next_row_idin the new manifest tonext_row_id + num_new_rows - If commit fails due to conflict, writer rebases:
- Re-reads the new
next_row_idfrom the latest version - Reassigns row IDs to new rows using the updated counter
- Retries commit
This protocol mirrors fragment ID assignment and ensures row IDs are unique across all table versions.
Row ID Behavior on Updates¶
When a row is updated, it is typically assigned a new row ID rather than reusing the old one. This avoids the complexity of updating secondary indices that may reference the old values.
Update Workflow:
- Original row with ID
Rexists at address(F1, O1) - Update operation creates new row with ID
R'at address(F2, O2) - Deletion vector marks row ID
Ras deleted in fragmentF1 - Secondary indices referencing old row ID
Rare invalidated through fragment bitmap updates - New row ID
R'requires index rebuild for affected columns
This approach ensures secondary indices do not reference stale data.
Row ID Sequences¶
Storage Format¶
Row ID sequences are stored using the RowIdSequence protobuf message.
The sequence is partitioned into segments, each encoded optimally based on the data pattern.
Segment Encodings¶
Each segment uses one of five encodings optimized for different data patterns:
Range (Contiguous Values)¶
For sorted, contiguous values with no gaps.
Example: Row IDs [100, 101, 102, 103, 104] → Range{start: 100, end: 105}.
Used for new fragments where row IDs are assigned sequentially.
Range protobuf message
Range with Holes (Sparse Deletions)¶
For sorted values with few gaps.
Example: Row IDs [100, 101, 103, 104] (missing 102) → RangeWithHoles{start: 100, end: 105, holes: [102]}.
Used for fragments with sparse deletions where maintaining the range is efficient.
RangeWithHoles protobuf message
message RangeWithHoles {
/// The start of the range, inclusive.
uint64 start = 1;
/// The end of the range, exclusive.
uint64 end = 2;
/// The holes in the range, as a sorted array of values;
/// Binary search can be used to check whether a value is a hole and should
/// be skipped. This can also be used to count the number of holes before a
/// given value, if you need to find the logical offset of a value in the
/// segment.
EncodedU64Array holes = 3;
}
Range with Bitmap (Dense Deletions)¶
For sorted values with many gaps. The bitmap encodes 8 values per byte, with the most significant bit representing the first value. Used for fragments with dense deletion patterns.
RangeWithBitmap protobuf message
message RangeWithBitmap {
/// The start of the range, inclusive.
uint64 start = 1;
/// The end of the range, exclusive.
uint64 end = 2;
/// A bitmap of the values in the range. The bitmap is a sequence of bytes,
/// where each byte represents 8 values. The first byte represents values
/// start to start + 7, the second byte represents values start + 8 to
/// start + 15, and so on. The most significant bit of each byte represents
/// the first value in the range, and the least significant bit represents
/// the last value in the range. If the bit is set, the value is in the
/// range; if it is not set, the value is not in the range.
bytes bitmap = 3;
}
Sorted Array (Sparse Values)¶
For sorted but non-contiguous values, stored as an EncodedU64Array.
Used for merged fragments or fragments after compaction.
Unsorted Array (General Case)¶
For unsorted values, stored as an EncodedU64Array.
Rare; most operations maintain sorted order.
Encoded U64 Arrays¶
The EncodedU64Array message supports bitpacked encoding to minimize storage.
The implementation selects the most compact encoding based on the value range, choosing between base + 16-bit offsets, base + 32-bit offsets, or full 64-bit values.
EncodedU64Array protobuf message
message EncodedU64Array {
message U16Array {
uint64 base = 1;
/// The deltas are stored as 16-bit unsigned integers.
/// (protobuf doesn't support 16-bit integers, so we use bytes instead)
bytes offsets = 2;
}
message U32Array {
uint64 base = 1;
/// The deltas are stored as 32-bit unsigned integers.
/// (we use bytes instead of uint32 to avoid overhead of varint encoding)
bytes offsets = 2;
}
message U64Array {
/// (We use bytes instead of uint64 to avoid overhead of varint encoding)
bytes values = 2;
}
oneof array {
U16Array u16_array = 1;
U32Array u32_array = 2;
U64Array u64_array = 3;
}
}
Inline vs External Storage¶
Row ID sequences are stored either inline in the fragment metadata or in external files. Sequences smaller than ~200KB are stored inline to avoid additional I/O, while larger sequences are written to external files referenced by path and offset. This threshold balances manifest size against the overhead of separate file reads.
DataFragment row_id_sequence field
Row ID Index¶
Construction¶
The row ID index is built at table load time by aggregating row ID sequences from all fragments:
For each fragment F with ID f:
For each (position p, row_id r) in F.row_id_sequence:
index[r] = (f, p)
This creates a mapping from row ID to current row address.
Index Invalidation with Updates¶
When rows are updated, the row ID index must account for stale mappings:
Example Scenario:
- Initial state: Fragment 1 contains rows with IDs
[1, 2, 3]at offsets[0, 1, 2] - Update operation modifies row 2:
- New fragment 2 created with row ID
4(new ID assigned) - Deletion vector marks row ID
2as deleted in fragment 1 - Row ID index:
1 → (1, 0)✓ Valid2 → (1, 1)✗ Invalid (deleted)3 → (1, 2)✓ Valid4 → (2, 0)✓ Valid (new row)
Fragment Bitmaps for Index Masking¶
Secondary indices use fragment bitmaps to track which row IDs remain valid:
Without Row ID Updates:
String Index on column "str":
Fragment Bitmap: {1, 2} (covers fragments 1 and 2)
All indexed row IDs are valid
With Row ID Updates:
Vector Index on column "vec":
Fragment Bitmap: {1} (only fragment 1)
Row ID 2 was updated, so index entry for ID 2 is stale
Index query filters out ID 2 using deletion vectors
This bitmap-based approach allows indices to remain immutable while accounting for row modifications.
Row Version Tracking¶
Created At Version¶
Each row tracks the version at which it was created. The sequence uses run-length encoding for efficient storage, where each run specifies a span of consecutive rows and the version they were created in.
Example: Fragment with 1000 rows created in version 5:
RowDatasetVersionSequence {
runs: [
RowDatasetVersionRun { span: Range{start: 0, end: 1000}, version: 5 }
]
}
DataFragment created_at_version_sequence field
RowDatasetVersionSequence protobuf messages
Last Updated At Version¶
Each row tracks the version at which it was last modified.
When a row is created, last_updated_at_version equals created_at_version.
When a row is updated, a new row is created with both created_at_version and last_updated_at_version set to the current version, and the old row is marked deleted.
Example: Row created in version 3, updated in version 7:
Old row (marked deleted):
created_at_version: 3
last_updated_at_version: 3
New row:
created_at_version: 7
last_updated_at_version: 7
DataFragment last_updated_at_version_sequence field
Change Data Feed¶
Lance supports querying rows that changed between versions through version tracking columns.
These queries can be expressed as standard SQL predicates on the _row_created_at_version and _row_last_updated_at_version columns.
Inserted Rows¶
Rows created between two versions can be retrieved by filtering on _row_created_at_version:
SELECT * FROM dataset
WHERE _row_created_at_version > {begin_version}
AND _row_created_at_version <= {end_version}
This query returns all rows inserted in the specified version range, including the version metadata columns _row_created_at_version, _row_last_updated_at_version, and _rowid.
Updated Rows¶
Rows modified (but not newly created) between two versions can be retrieved by combining filters on both version columns:
SELECT * FROM dataset
WHERE _row_created_at_version <= {begin_version}
AND _row_last_updated_at_version > {begin_version}
AND _row_last_updated_at_version <= {end_version}
This query excludes newly inserted rows by requiring _row_created_at_version <= {begin_version}, ensuring only pre-existing rows that were subsequently updated are returned.