kowshik24/fomi

FOMI

Frequency-Optimized Manifold Indexing (FOMI) is an experimental Python library for approximate nearest neighbor (ANN) search on image-like embedding vectors.

FOMI combines:

  • semantic routing (cluster-level pruning),
  • spectral product quantization (SPQ) in the DCT domain,
  • manifold-aware graph search (MAG) inside semantic partitions.
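
One reason quantizing in the DCT domain is reasonable: an orthonormal DCT is a rotation, so Euclidean distances between vectors are unchanged by the transform. The sketch below illustrates that property with plain NumPy. The helper `dct_matrix` is hypothetical and is not part of FOMI; how FOMI actually computes its transform is not shown in this README.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II matrix of size n x n (illustrative helper)."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    i = np.arange(n)[None, :]   # sample index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (i + 0.5) * k / n)
    mat[0, :] /= np.sqrt(2.0)   # rescale the DC row so that mat @ mat.T == I
    return mat

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 64))

C = dct_matrix(64)
spectral = x @ C.T              # DCT coefficients of each row of x

# Distances before and after the transform agree up to rounding error.
dist_raw = np.linalg.norm(x[0] - x[1])
dist_dct = np.linalg.norm(spectral[0] - spectral[1])
print(dist_raw, dist_dct)
```

Because the geometry is preserved, any quantization error introduced afterwards can be measured directly in coefficient space.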

Status

This project is in alpha and is intended for research and prototyping workloads.

Features

  • Integrated ANN index via FOMIIndex
  • SPQ compression (SpectralProductQuantization) with band-aware coding
  • MAG search graph (ManifoldAwareGraph) with curvature-aware neighbor selection
  • Benchmarking utilities with optional FAISS comparison
  • Visualization helpers for memory/performance/manifold plots

Installation

From source

git clone <your-repo-url>
cd fomi
python3 -m venv .venv
source .venv/bin/activate
pip install -e .

Optional extras

# Dev tools
pip install -e ".[dev]"

# Visualization stack
pip install -e ".[viz]"

# FAISS benchmark support
pip install -e ".[faiss]"

Quick Start

import numpy as np
from fomi import FOMIIndex

# Sample vectors
vectors = np.random.randn(10000, 512).astype(np.float32)

# Build index
index = FOMIIndex(d=512, n_semantic_clusters=16)
index.build(vectors, verbose=True)

# Single-query search
query = np.random.randn(512).astype(np.float32)
neighbors = index.search(query, k=10, n_probe=2)
print(neighbors)

# Batch search
queries = np.random.randn(32, 512).astype(np.float32)
batch_neighbors = index.batch_search(queries, k=10, n_probe=2)

# Save / load
index.save("fomi_index.pkl")
loaded = FOMIIndex.load("fomi_index.pkl")
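
When trying out an approximate index it helps to have exact neighbors to compare against. The following is a minimal brute-force sketch in NumPy. It assumes squared Euclidean distance, which this README does not confirm as FOMI's metric, and `exact_knn` is a hypothetical helper, not part of the library.

```python
import numpy as np

def exact_knn(vectors: np.ndarray, queries: np.ndarray, k: int) -> np.ndarray:
    """Return the indices of the k nearest vectors for each query, closest first."""
    queries = np.atleast_2d(queries)
    # ||q - v||^2 = ||q||^2 - 2 q.v + ||v||^2, computed for all pairs at once
    d2 = (
        (queries ** 2).sum(axis=1)[:, None]
        - 2.0 * queries @ vectors.T
        + (vectors ** 2).sum(axis=1)[None, :]
    )
    # Select the k smallest per row, then sort just those k by distance.
    idx = np.argpartition(d2, k - 1, axis=1)[:, :k]
    order = np.argsort(np.take_along_axis(d2, idx, axis=1), axis=1)
    return np.take_along_axis(idx, order, axis=1)

rng = np.random.default_rng(0)
vectors = rng.standard_normal((1000, 32))
queries = rng.standard_normal((4, 32))
ground_truth = exact_knn(vectors, queries, k=10)   # shape (4, 10)
```

The resulting `ground_truth` can be compared with the output of `index.batch_search`, assuming that method returns integer row indices into the original array.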

Core API

FOMIIndex

  • build(vectors, verbose=True)
  • search(query, k=10, n_probe=1, ef=50, use_exact_rerank=True)
  • batch_search(queries, k=10, n_probe=1, ef=50)
  • save(path) / load(path)
  • get_memory_usage() / get_index_statistics()
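
The `n_probe` parameter suggests the usual cluster-probing scheme: rank the semantic clusters by centroid distance to the query and search only the closest few. A rough sketch of that idea follows. The function names, and the assumption that routing uses Euclidean distance to centroids, are illustrative and not taken from FOMI's source.

```python
import numpy as np

def route_to_clusters(query: np.ndarray, centroids: np.ndarray, n_probe: int = 1) -> np.ndarray:
    """Return the ids of the n_probe clusters whose centroids are closest to the query."""
    d2 = ((centroids - query) ** 2).sum(axis=1)
    return np.argsort(d2)[:n_probe]

def candidates_for_query(query, centroids, assignments, n_probe=1):
    """Return the indices of vectors that live in the probed clusters."""
    probed = route_to_clusters(query, centroids, n_probe)
    return np.flatnonzero(np.isin(assignments, probed))

centroids = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
assignments = np.array([0, 0, 1, 2, 1])   # cluster id of each stored vector
query = np.array([1.0, 1.0])

print(route_to_clusters(query, centroids, n_probe=2))
print(candidates_for_query(query, centroids, assignments, n_probe=2))
```

Under this scheme, raising `n_probe` enlarges the candidate set, which typically trades query speed for recall.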

SpectralProductQuantization

  • train(vectors, n_iter=100, verbose=True)
  • encode(vectors)
  • decode(codes)
  • asymmetric_distance(queries, codes)
  • compute_reconstruction_error(vectors)
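
For readers unfamiliar with product quantization, here is a compact standalone sketch of the encode / decode / asymmetric-distance cycle. It is not FOMI's implementation: codebooks are drawn by random sampling as a stand-in for k-means training, the band-aware coding is omitted, and every function here is hypothetical, including the ones whose names echo the methods above.

```python
import numpy as np

def train_codebooks(vectors, n_subspaces, n_codes, rng):
    """Stand-in for k-means: use random training vectors as codewords.
    Returns an array of shape (n_subspaces, n_codes, sub_dim)."""
    n, d = vectors.shape
    sub_d = d // n_subspaces
    picks = rng.choice(n, size=n_codes, replace=False)
    return vectors[picks].reshape(n_codes, n_subspaces, sub_d).transpose(1, 0, 2).copy()

def encode(vectors, codebooks):
    """Map each sub-vector to the id of its nearest codeword (n_codes <= 256)."""
    m, _, sub_d = codebooks.shape
    subs = vectors.reshape(len(vectors), m, sub_d)
    d2 = ((subs[:, :, None, :] - codebooks[None]) ** 2).sum(axis=-1)  # (n, m, n_codes)
    return d2.argmin(axis=-1).astype(np.uint8)

def decode(codes, codebooks):
    """Rebuild approximate vectors by concatenating the selected codewords."""
    m = codebooks.shape[0]
    return np.concatenate([codebooks[j][codes[:, j]] for j in range(m)], axis=1)

def asymmetric_distance(query, codes, codebooks):
    """Squared distance from an uncompressed query to each encoded vector."""
    m, _, sub_d = codebooks.shape
    table = ((codebooks - query.reshape(m, 1, sub_d)) ** 2).sum(axis=-1)  # (m, n_codes)
    return table[np.arange(m), codes].sum(axis=1)

rng = np.random.default_rng(0)
data = rng.standard_normal((500, 32))
codebooks = train_codebooks(data, n_subspaces=4, n_codes=16, rng=rng)
codes = encode(data, codebooks)
query = rng.standard_normal(32)
approx_d2 = asymmetric_distance(query, codes, codebooks)
```

The lookup table is built once per query, so scoring each stored vector costs only `n_subspaces` table reads instead of a full distance computation.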

ManifoldAwareGraph

  • build(vectors, verbose=True)
  • search(vectors, query, k=10, ef=50)
  • batch_search(vectors, queries, k=10, ef=50)
  • save(path) / load(path)
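
The `ef` parameter resembles the candidate-list size used in best-first graph search (as in HNSW). The sketch below shows that search pattern on a plain k-nearest-neighbor graph. It is an assumption that `ef` plays this role in ManifoldAwareGraph, and the curvature-aware neighbor selection is not reproduced here; `build_knn_graph` and `graph_search` are hypothetical helpers.

```python
import heapq
import numpy as np

def build_knn_graph(vectors, n_neighbors):
    """Brute-force kNN graph: row i lists the n_neighbors points closest to point i."""
    d2 = ((vectors[:, None, :] - vectors[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)            # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :n_neighbors]

def graph_search(vectors, graph, query, k=10, ef=50, entry=0):
    """Best-first search that keeps at most ef results while exploring the graph."""
    def dist(i):
        return float(((vectors[i] - query) ** 2).sum())

    d0 = dist(entry)
    visited = {entry}
    candidates = [(d0, entry)]              # min-heap: closest unexplored node first
    best = [(-d0, entry)]                   # max-heap via negation: current results
    while candidates:
        d, node = heapq.heappop(candidates)
        if len(best) >= ef and d > -best[0][0]:
            break                           # nothing left can improve the result set
        for nb in graph[node]:
            nb = int(nb)
            if nb in visited:
                continue
            visited.add(nb)
            dn = dist(nb)
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)     # drop the current worst result
    ranked = sorted((-neg_d, i) for neg_d, i in best)
    return [i for _, i in ranked][:k]

points = np.arange(50, dtype=float).reshape(-1, 1)   # points on a line
graph = build_knn_graph(points, n_neighbors=2)
result = graph_search(points, graph, np.array([30.2]), k=3, ef=5)
print(result)
```

A larger `ef` explores more of the graph before stopping, which generally improves recall at the cost of more distance evaluations.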

Benchmarks

Run the benchmark script:

python3 scripts/run_benchmark.py --help
python3 scripts/run_benchmark.py --n-vectors 10000 --dimension 512 --k 100

Or use the programmatic evaluator:

from fomi.evaluation.benchmark import FOMIEvaluator

evaluator = FOMIEvaluator()
vectors, labels = evaluator.generate_test_data(n_vectors=5000, d=256)
results = evaluator.benchmark(vectors=vectors, labels=labels, k=100, include_faiss=False)
evaluator.print_summary()
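
ANN benchmarks commonly report recall@k, the fraction of true nearest neighbors that the approximate search returns. Whether FOMIEvaluator uses exactly this definition is not stated in this README, so the function below is a generic sketch with a hypothetical name.

```python
import numpy as np

def recall_at_k(approx_ids, exact_ids, k):
    """Average fraction of the true top-k neighbors found in the approximate top-k."""
    approx_ids = np.asarray(approx_ids)[:, :k]
    exact_ids = np.asarray(exact_ids)[:, :k]
    hits = sum(
        len(set(a) & set(e))
        for a, e in zip(approx_ids.tolist(), exact_ids.tolist())
    )
    return hits / (exact_ids.shape[0] * k)

approx = [[1, 2, 3], [4, 5, 6]]   # ids returned by an approximate index
exact = [[1, 2, 9], [4, 5, 6]]    # ids from an exact search
score = recall_at_k(approx, exact, k=3)
print(score)
```

Order within the top-k is ignored; only set overlap counts.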

Examples

  • examples/basic_usage.py
  • examples/benchmark_demo.py
  • examples/visualization_demo.py

Development

Local setup

pip install -r requirements-dev.txt

Run tests

python3 -m pytest -v

Formatting / lint

black .
isort .
flake8
mypy fomi

Project Layout

fomi/
  core/
    index.py
    spq.py
    mag.py
  clustering/
    semantic.py
  graph/
    curvature.py
  quantization/
    codebooks.py
  evaluation/
    benchmark.py
    metrics.py
  visualization/
    plots.py
  utils/
    data.py
    transforms.py

Notes

  • Security: index and model files are loaded with Python pickle, which can execute arbitrary code during loading. Only load files from trusted sources.
  • Reproducibility: many algorithms include randomized steps; fix random seeds in experiments that need repeatable results.
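
One way to fix seeds in an experiment script is sketched below. It seeds Python's `random` module, NumPy's legacy global generator (the one behind `np.random.randn`), and returns a dedicated `Generator` for new code. Whether FOMI's internals draw from the global NumPy state is an assumption; `make_rng` is a hypothetical helper.

```python
import random
import numpy as np

def make_rng(seed: int) -> np.random.Generator:
    """Seed the global generators and return a dedicated NumPy Generator."""
    random.seed(seed)
    np.random.seed(seed)   # legacy global state, used by np.random.randn
    return np.random.default_rng(seed)

rng = make_rng(42)
vectors = rng.standard_normal((1000, 64)).astype(np.float32)
```

Calling `make_rng` with the same seed at the start of each run produces the same sample vectors every time.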

License

MIT
