Lance Iceberg REST Catalog Implementation Spec
This document describes how the Apache Iceberg REST Catalog implements the Lance Namespace client spec.
Background
Apache Iceberg REST Catalog is a standardized REST API for interacting with Iceberg catalogs. It provides a vendor-neutral interface for managing tables and namespaces across different catalog backends. When registering a Lance table, the implementation creates a companion Iceberg table with a dummy schema at the same location, using table properties to identify it as a Lance table. For details on the Iceberg REST Catalog, see the Iceberg REST Catalog Specification.
Namespace Implementation Configuration Properties
The Lance Iceberg REST Catalog namespace implementation accepts the following configuration properties:
The endpoint property is required and specifies the Iceberg REST Catalog server endpoint URL (e.g., http://localhost:8181). The value must start with http:// or https://.
The warehouse property is optional and specifies the warehouse identifier to use. Some Iceberg REST implementations require this.
The prefix property is optional and specifies the API path prefix (e.g., v1). Default value is empty.
The auth_token property is optional and specifies the bearer token for authentication.
The credential property is optional and specifies the OAuth2 client credential in the format client_id:client_secret for client credentials authentication flow.
The connect_timeout property is optional and specifies the connection timeout in milliseconds. Default value is 10000 (10 seconds).
The read_timeout property is optional and specifies the read timeout in milliseconds. Default value is 30000 (30 seconds).
The max_retries property is optional and specifies the maximum number of retries for failed requests. Default value is 3.
The root property is optional and specifies the default storage root location for tables. Default value is the current working directory.
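As an illustration, the properties and defaults listed above could be collected into a small validated structure. This is a minimal sketch, not the implementation's actual API: the names IcebergRestConfig and config_from_properties are hypothetical, and the check on the credential format is an assumption derived from the client_id:client_secret description.

```python
import os
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class IcebergRestConfig:
    """Hypothetical holder for the configuration properties listed above."""
    endpoint: str
    warehouse: Optional[str] = None
    prefix: str = ""
    auth_token: Optional[str] = None
    credential: Optional[str] = None
    connect_timeout: int = 10000  # milliseconds
    read_timeout: int = 30000     # milliseconds
    max_retries: int = 3
    root: Optional[str] = None    # resolved to the working directory if unset

    def __post_init__(self) -> None:
        if not self.endpoint.startswith(("http://", "https://")):
            raise ValueError("endpoint must start with http:// or https://")
        # Assumption: reject credentials that lack the ':' separator.
        if self.credential is not None and ":" not in self.credential:
            raise ValueError("credential must look like client_id:client_secret")
        if self.root is None:
            self.root = os.getcwd()


def config_from_properties(props: Dict[str, str]) -> IcebergRestConfig:
    """Build a config from a string-to-string property map."""
    if "endpoint" not in props:
        raise ValueError("the endpoint property is required")
    kwargs = dict(props)
    # Numeric properties arrive as strings and are converted here.
    for key in ("connect_timeout", "read_timeout", "max_retries"):
        if key in kwargs:
            kwargs[key] = int(kwargs[key])
    return IcebergRestConfig(**kwargs)
```

For example, config_from_properties({"endpoint": "http://localhost:8181"}) yields a config with the documented defaults, while an endpoint such as ftp://host raises ValueError.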
Object Mapping
Namespace
The root namespace is represented by the Iceberg catalog root, accessed via the /namespaces endpoint.
A child namespace is a nested namespace in Iceberg. Iceberg supports arbitrary nesting depth using an array of strings (e.g., ["level1", "level2", "level3"]).
The namespace identifier is constructed by joining namespace levels with the \x1F (unit separator) character for API calls. In user-facing contexts, a . delimiter is used.
Namespace properties are stored in the namespace's properties map, returned by the Iceberg namespace API.
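The two identifier forms described in this section can be sketched as follows. The function names are hypothetical; the only facts taken from the text are the \x1F separator for API calls and the . delimiter for user-facing display. Percent-encoding the joined value for use in a URL path is done here with the standard library.

```python
from typing import List
from urllib.parse import quote

UNIT_SEPARATOR = "\x1f"


def namespace_to_path_segment(levels: List[str]) -> str:
    """Join namespace levels with the unit separator and percent-encode
    the result so it can be placed in a URL path."""
    return quote(UNIT_SEPARATOR.join(levels), safe="")


def namespace_to_display(levels: List[str]) -> str:
    """Join namespace levels with '.' for user-facing output."""
    return ".".join(levels)


levels = ["level1", "level2", "level3"]
print(namespace_to_path_segment(levels))  # level1%1Flevel2%1Flevel3
print(namespace_to_display(levels))       # level1.level2.level3
```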
Table
A table is represented as a regular Iceberg table with a dummy schema. The dummy schema contains a single nullable string column named dummy. This approach ensures compatibility with the Iceberg REST Catalog API while storing the actual Lance table at the same location.
The table identifier is constructed by joining the namespace path and table name.
The table location is stored in the location field of the Iceberg table metadata, pointing to the root location of the Lance table.
Table properties are stored in the Iceberg table's properties map.
Lance Table Identification
A table in the Iceberg REST Catalog is identified as a Lance table when its properties map contains the key table_type with the value lance (case-insensitive). The location must point to a valid Lance table root directory. The Iceberg table itself serves as a metadata wrapper, with the actual data stored in Lance format.
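The property check can be expressed in a few lines. This sketch only covers the table_type comparison; verifying that the location points to a valid Lance table root is outside its scope. The function name is hypothetical.

```python
from typing import Mapping


def is_lance_table(properties: Mapping[str, str]) -> bool:
    """Return True when table_type equals 'lance', ignoring case."""
    value = properties.get("table_type")
    return isinstance(value, str) and value.lower() == "lance"


print(is_lance_table({"table_type": "LANCE"}))    # True
print(is_lance_table({"table_type": "iceberg"}))  # False
print(is_lance_table({}))                         # False
```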
Basic Operations
CreateNamespace
Creates a new namespace in the Iceberg catalog.
The implementation:
- Parse the namespace identifier to get the namespace array
- Construct a CreateNamespaceRequest with the namespace array and properties
- POST to the /v1/{prefix}/namespaces endpoint
- Return the created namespace properties
Error Handling:
If the namespace already exists, return error code 2 (NamespaceAlreadyExists). If the parent namespace does not exist, return error code 1 (NamespaceNotFound). If the server returns an error, return error code 18 (Internal).
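The request construction for CreateNamespace might look like the sketch below. It assumes the caller passes the user-facing identifier with the . delimiter, and that an empty prefix is simply omitted from the path; both are assumptions, as are all function names. The body shape (namespace array plus properties map) follows the CreateNamespaceRequest described above.

```python
import json
from typing import Dict, List, Optional


def parse_namespace_id(identifier: str, delimiter: str = ".") -> List[str]:
    """Split a user-facing identifier into namespace levels.
    Assumption: empty segments are dropped, so '' maps to the root ([])."""
    return [part for part in identifier.split(delimiter) if part]


def namespaces_path(prefix: str = "") -> str:
    """Build /v1/{prefix}/namespaces, omitting an empty prefix (assumed)."""
    parts = ["v1"]
    if prefix.strip("/"):
        parts.append(prefix.strip("/"))
    parts.append("namespaces")
    return "/" + "/".join(parts)


def build_create_namespace_request(
    identifier: str, properties: Optional[Dict[str, str]] = None
) -> dict:
    """Assemble the JSON body for the POST request."""
    return {
        "namespace": parse_namespace_id(identifier),
        "properties": dict(properties or {}),
    }


body = build_create_namespace_request("level1.level2", {"owner": "data-team"})
print(namespaces_path(""))  # /v1/namespaces
print(json.dumps(body))
```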
ListNamespaces
Lists child namespaces under a given parent namespace.
The implementation:
- Parse the parent namespace identifier
- GET /v1/{prefix}/namespaces with the parent query parameter
- Extract namespace names from the response
Error Handling:
If the parent namespace does not exist, return error code 1 (NamespaceNotFound). If the server returns an error, return error code 18 (Internal).
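A sketch of the ListNamespaces request URL and response handling follows. It assumes the parent value is the \x1F-joined namespace path, that the parameter is omitted when listing under the root, and that the response carries a namespaces field holding one array of levels per namespace; the function names are hypothetical.

```python
from typing import List
from urllib.parse import urlencode


def list_namespaces_url(endpoint: str, prefix: str, parent: List[str]) -> str:
    """Build the GET URL, adding ?parent=... only for a non-root parent."""
    base = endpoint.rstrip("/") + "/v1"
    if prefix.strip("/"):
        base += "/" + prefix.strip("/")
    url = base + "/namespaces"
    if parent:
        url += "?" + urlencode({"parent": "\x1f".join(parent)})
    return url


def extract_child_names(response_body: dict) -> List[str]:
    """Take the last level of each returned namespace as the child name."""
    return [ns[-1] for ns in response_body.get("namespaces", []) if ns]


print(list_namespaces_url("http://localhost:8181", "", ["a", "b"]))
# http://localhost:8181/v1/namespaces?parent=a%1Fb
print(extract_child_names({"namespaces": [["a", "b", "c"], ["a", "b", "d"]]}))
# ['c', 'd']
```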
DescribeNamespace
Retrieves properties and metadata for a namespace.
The implementation:
- Parse the namespace identifier
- GET /v1/{prefix}/namespaces/{namespace} with URL-encoded namespace path
- Return the namespace properties
Error Handling:
If the namespace does not exist, return error code 1 (NamespaceNotFound). If the server returns an error, return error code 18 (Internal).
DropNamespace
Removes a namespace from the Iceberg catalog.
The implementation:
- Parse the namespace identifier
- DELETE /v1/{prefix}/namespaces/{namespace} with URL-encoded namespace path
Error Handling:
If the namespace does not exist, return error code 1 (NamespaceNotFound). If the namespace is not empty, return error code 3 (NamespaceNotEmpty). If the server returns an error, return error code 18 (Internal).
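The error codes named throughout this document can be gathered into an enumeration, with a per-operation mapping from the server response. The numeric values below are the ones given in the text. The HTTP status codes used for DropNamespace (404 for a missing namespace, 409 for a non-empty one) are an assumption about how the server signals these conditions, and the names ErrorCode and drop_namespace_error are hypothetical.

```python
from enum import IntEnum
from typing import Optional


class ErrorCode(IntEnum):
    """Error codes as numbered in this document."""
    NAMESPACE_NOT_FOUND = 1
    NAMESPACE_ALREADY_EXISTS = 2
    NAMESPACE_NOT_EMPTY = 3
    TABLE_NOT_FOUND = 4
    TABLE_ALREADY_EXISTS = 5
    INVALID_INPUT = 13
    INTERNAL = 18


def drop_namespace_error(http_status: int) -> Optional[ErrorCode]:
    """Map a DropNamespace response status to an error code.
    Returns None on success. The 404/409 mapping is assumed."""
    if 200 <= http_status < 300:
        return None
    if http_status == 404:
        return ErrorCode.NAMESPACE_NOT_FOUND
    if http_status == 409:
        return ErrorCode.NAMESPACE_NOT_EMPTY
    return ErrorCode.INTERNAL
```

Any other failure status falls through to the Internal code, matching the catch-all rule stated in the error handling text.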
DeclareTable
Declares a new Lance table in the Iceberg catalog without creating the underlying data.
The implementation:
- Parse the table identifier to extract namespace and table name
- Construct a CreateTableRequest with:
  - name: the table name
  - location: the specified or default location
  - schema: a dummy Iceberg schema with a single nullable string column dummy
  - properties: table properties including table_type=lance
- POST to /v1/{prefix}/namespaces/{namespace}/tables
- Return the created table location and properties
Error Handling:
If the parent namespace does not exist, return error code 1 (NamespaceNotFound). If the table already exists, return error code 5 (TableAlreadyExists). If the server returns an error, return error code 18 (Internal).
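The CreateTableRequest body for DeclareTable could be assembled as in this sketch. The schema uses the Iceberg JSON form for a struct with one optional string field; the field id of 1 and schema-id of 0 are assumptions, as are the function names. How a default location is derived is not specified here, so the caller supplies the location explicitly.

```python
from typing import Dict, Optional


def dummy_schema() -> dict:
    """Iceberg schema with a single nullable string column named 'dummy'.
    The field id (1) and schema-id (0) are assumed values."""
    return {
        "type": "struct",
        "schema-id": 0,
        "fields": [
            {"id": 1, "name": "dummy", "required": False, "type": "string"}
        ],
    }


def build_declare_table_request(
    name: str, location: str, properties: Optional[Dict[str, str]] = None
) -> dict:
    """Assemble the request body, always marking the table as Lance."""
    props = dict(properties or {})
    props["table_type"] = "lance"
    return {
        "name": name,
        "location": location,
        "schema": dummy_schema(),
        "properties": props,
    }


request = build_declare_table_request("events", "s3://bucket/ns/events")
print(request["properties"])  # {'table_type': 'lance'}
```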
ListTables
Lists all Lance tables in a namespace.
The implementation:
- Parse the namespace identifier
- GET /v1/{prefix}/namespaces/{namespace}/tables
- For each table, load its metadata and filter tables where properties.table_type=lance
- Extract table names from the response identifiers
Error Handling:
If the namespace does not exist, return error code 1 (NamespaceNotFound). If the server returns an error, return error code 18 (Internal).
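The filtering step in ListTables can be sketched with the HTTP calls abstracted away. Here load_table is a hypothetical callable standing in for the per-table metadata request, and the response shapes (an identifiers list of namespace/name pairs, and properties nested under metadata) are assumptions about the catalog's JSON.

```python
from typing import Callable, List


def list_lance_tables(
    list_response: dict,
    load_table: Callable[[List[str], str], dict],
) -> List[str]:
    """Return the names of tables whose properties mark them as Lance."""
    names = []
    for ident in list_response.get("identifiers", []):
        loaded = load_table(ident["namespace"], ident["name"])
        props = loaded.get("metadata", {}).get("properties", {})
        if str(props.get("table_type", "")).lower() == "lance":
            names.append(ident["name"])
    return names


# Usage with an in-memory stand-in for the catalog.
fake_catalog = {
    "t1": {"metadata": {"properties": {"table_type": "lance"}}},
    "t2": {"metadata": {"properties": {}}},
}
listing = {"identifiers": [{"namespace": ["ns"], "name": "t1"},
                           {"namespace": ["ns"], "name": "t2"}]}
print(list_lance_tables(listing, lambda ns, name: fake_catalog[name]))  # ['t1']
```

Because every table requires its own metadata load, the cost of this operation grows with the number of tables in the namespace, including non-Lance ones.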
DescribeTable
Retrieves metadata for a Lance table.
The implementation:
- Parse the table identifier to extract namespace and table name
- GET /v1/{prefix}/namespaces/{namespace}/tables/{table}
- Verify the table has table_type=lance property
- Return the table location and properties
Error Handling:
If the table does not exist, return error code 4 (TableNotFound). If the table is not a Lance table, return error code 13 (InvalidInput). If the server returns an error, return error code 18 (Internal).
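The DescribeTable outcome handling, including the three error cases, might be written as below. The exception class and function name are hypothetical. Treating HTTP 404 as "table does not exist" and reading location and properties from a metadata object in the response are assumptions about the server's behavior.

```python
class LanceNamespaceError(Exception):
    """Hypothetical error type carrying one of the documented codes."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(f"[{code}] {message}")
        self.code = code


def describe_table_result(http_status: int, body: dict) -> dict:
    """Turn a load-table response into location and properties,
    or raise with error code 4, 13, or 18."""
    if http_status == 404:  # assumed status for a missing table
        raise LanceNamespaceError(4, "TableNotFound")
    if http_status != 200:
        raise LanceNamespaceError(18, "Internal")
    metadata = body.get("metadata", {})
    properties = metadata.get("properties", {})
    if str(properties.get("table_type", "")).lower() != "lance":
        raise LanceNamespaceError(13, "InvalidInput: not a Lance table")
    return {"location": metadata.get("location"), "properties": properties}
```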
DeregisterTable
Removes a Lance table registration from the Iceberg catalog without deleting the underlying data.
The implementation:
- Parse the table identifier to extract namespace and table name
- DELETE /v1/{prefix}/namespaces/{namespace}/tables/{table}?purgeRequested=false
Error Handling:
If the table does not exist, return error code 4 (TableNotFound). If the server returns an error, return error code 18 (Internal).
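For illustration, the DeregisterTable request URL can be built as follows. The function name is hypothetical, and omitting an empty prefix from the path is an assumption. The purgeRequested=false parameter is what leaves the underlying Lance data in place.

```python
from typing import List
from urllib.parse import quote, urlencode


def deregister_table_url(
    endpoint: str, prefix: str, namespace: List[str], table: str
) -> str:
    """Build the DELETE URL for removing a table registration only."""
    base = endpoint.rstrip("/") + "/v1"
    if prefix.strip("/"):
        base += "/" + prefix.strip("/")
    ns_segment = quote("\x1f".join(namespace), safe="")
    table_segment = quote(table, safe="")
    query = urlencode({"purgeRequested": "false"})
    return f"{base}/namespaces/{ns_segment}/tables/{table_segment}?{query}"


print(deregister_table_url("http://localhost:8181", "", ["a", "b"], "t1"))
# http://localhost:8181/v1/namespaces/a%1Fb/tables/t1?purgeRequested=false
```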