Branch and Tag Specification¶
Overview¶
Lance supports branching and tagging for managing multiple independent version histories and creating named references to specific versions. Branches enable parallel development workflows, while tags provide stable named references for important versions.
Branching¶
Branch Name¶
Branch names must follow these validation rules:
- Cannot be empty
- Cannot start or end with
/ - Cannot contain consecutive
// - Cannot contain
..or\ - Segments must contain only alphanumeric characters,
.,-,_ - Cannot end with
.lock - Cannot be named
main(reserved for main branch)
Branch Metadata Path¶
Branch metadata is stored at _refs/branches/{branch-name}.json in the dataset root.
Since branch names support hierarchical naming with / characters, the / is URL-encoded as %2F in the filename to distinguish it from directory separators (e.g., bugfix/issue-123 becomes bugfix%2Fissue-123.json):
{dataset_root}/
_refs/
branches/
feature-a.json
bugfix%2Fissue-123.json # Note: '/' encoded as '%2F'
Branch Metadata File Format¶
Each branch metadata file is a JSON file with the following fields:
| JSON Key | Type | Optional | Description |
|---|---|---|---|
parent_branch |
string | Yes | Name of the branch this was created from. null indicates branched from main. |
parent_version |
number | Version number of the parent branch at the time this branch was created. | |
create_at |
number | Unix timestamp (seconds since epoch) when the branch was created. | |
manifest_size |
number | Size of the initial manifest file in bytes. |
Branch Dataset Layout¶
Each branch dataset is technically a shallow clone of the source dataset.
Branch datasets are organized using the tree/ directory at the dataset root:
{dataset_root}/
tree/
{branch_name}/
_versions/
*.manifest
_transactions/
*.txn
_deletions/
*.arrow
*.bin
_indices/
{UUID}/
index.idx
Named branches store their version-specific files under tree/{branch_name}/, resembling the GitHub branch path convention.
It uses the branch name as is to form the path,
which means / would create a logical subdirectory (e.g., bugfix/issue-123, feature/user-auth):
{dataset_root}/
tree/
feature-a/
_versions/
1.manifest
2.manifest
bugfix/
issue-123/
_versions/
1.manifest
Tagging¶
Tag Name¶
Tag names must follow these validation rules:
- Cannot be empty
- Must contain only alphanumeric characters,
.,-,_ - Cannot start or end with
. - Cannot end with
.lock - Cannot contain consecutive
..
Note that tag names do not support / characters, unlike branch names.
Tag Storage¶
Tags are stored as JSON files under _refs/tags/ at the dataset root:
Tags are always stored at the root dataset level, regardless of which branch they reference.
Tag File Format¶
Each tag file is a JSON file with the following fields:
| JSON Key | Type | Optional | Description |
|---|---|---|---|
branch |
string | Yes | Branch name being tagged. null or absent indicates main branch. |
version |
number | Version number being tagged within that branch. | |
manifest_size |
number | Size of the manifest file in bytes. Used for efficient manifest loading. |