DocIndex: User defined mapping from python type to db type #1190

@JohannesMessner

When a user creates a Document Index with a schema, it is often ambiguous which types should be used in the database:

class MySchema(BaseDocument):
    text: str  # should this be a string or a varchar? if varchar, what length?
    embedding: NdArray  # should this be float tensor or boolean tensor?
    num: float  # float32 or float64?

index = MyDocIndex[MySchema]

Right now, the method python_type_to_db_type resolves these ambiguities, but it leaves the user no choice.

We should enable an (optional!) feature like this:

class MySchema(BaseDocument):
    text: str = Field(..., col_type='varchar', max_len=2048)
    embedding: NdArray = Field(col_type='boolean_tensor')
    num: float = Field(col_type='float64')

index = MyDocIndex[MySchema]
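
To make the intended lookup order concrete, here is a minimal sketch of how an index might resolve a column's type: use the per-field override if the user gave one, otherwise fall back to the default mapping. It is not the DocIndex implementation. It uses stdlib dataclasses with `field(metadata=...)` as a stand-in for the schema's `Field(...)`, and `DEFAULT_DB_TYPES`, `resolve_column`, the `count` field, and the body of `python_type_to_db_type` are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict

# Hypothetical defaults; a real Document Index backend would define its own.
DEFAULT_DB_TYPES: Dict[type, str] = {str: "string", float: "float32", int: "int64"}


def python_type_to_db_type(python_type: type) -> str:
    """Stand-in for the backend's default Python-to-DB type mapping."""
    return DEFAULT_DB_TYPES[python_type]


def resolve_column(f) -> Dict[str, Any]:
    """Prefer a user-supplied col_type; otherwise fall back to the default."""
    config = dict(f.metadata)  # copy, since field metadata is read-only
    col_type = config.pop("col_type", None) or python_type_to_db_type(f.type)
    # Remaining keys (e.g. max_len) are passed through as column options.
    return {"col_type": col_type, **config}


@dataclass
class MySchema:
    # dataclass metadata stands in for Field(col_type=..., max_len=...)
    text: str = field(default="", metadata={"col_type": "varchar", "max_len": 2048})
    num: float = field(default=0.0, metadata={"col_type": "float64"})
    count: int = 0  # no override: the default mapping applies


columns = {f.name: resolve_column(f) for f in fields(MySchema)}
```

Because the override is optional, schemas that specify nothing keep today's behaviour, while extra keys such as `max_len` travel alongside the chosen column type.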
