LATTICE: LLM-guided Hierarchical Retrieval

LATTICE turns retrieval into an LLM-driven navigation problem over a semantic scaffold, which keeps search computationally tractable on large corpora.

Overview

LATTICE proposes an LLM-native retrieval paradigm that combines the efficiency of hierarchical search with the reasoning power of modern large language models. Instead of relying on a static retriever + reranker pipeline or attempting to place a large corpus directly in an LLM context, LATTICE organizes the corpus into a semantic tree and uses an LLM as an active search agent that navigates that tree. This design yields logarithmic search complexity while preserving the LLM’s ability to perform nuanced, multi-step relevance judgments for complex, reasoning-heavy queries.
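To make the navigation idea concrete, here is a minimal sketch of beam search over a small semantic tree. It is an illustration under stated assumptions, not the code in this repository: `Node`, `toy_relevance`, and `beam_traverse` are hypothetical names, and the word-overlap scorer only stands in for the LLM's relevance judgment. The `beam_size` and `num_iters` arguments loosely mirror the `--max_beam_size` and `--num_iters` options.

```python
from dataclasses import dataclass, field
import heapq


@dataclass
class Node:
    """A tree node: internal nodes carry a summary, leaves carry a document."""
    text: str
    children: list["Node"] = field(default_factory=list)

    @property
    def is_leaf(self) -> bool:
        return not self.children


def toy_relevance(query: str, node: Node) -> float:
    """Stand-in for the LLM judgment: fraction of query words found in the node text."""
    words = query.lower().split()
    return sum(w in node.text.lower() for w in words) / len(words)


def beam_traverse(root: Node, query: str, beam_size: int = 2, num_iters: int = 20) -> list[Node]:
    """Expand the best-scoring children of the current beam; collect leaves reached."""
    frontier = [root]
    found: list[Node] = []
    for _ in range(num_iters):
        if not frontier:
            break
        # Only children of beam nodes are scored, so most of the tree is never visited.
        candidates = [c for n in frontier for c in n.children]
        best = heapq.nlargest(beam_size, candidates, key=lambda c: toy_relevance(query, c))
        found.extend(c for c in best if c.is_leaf)
        frontier = [c for c in best if not c.is_leaf]
    return found


# Hypothetical two-level tree with made-up summaries and documents.
root = Node("all topics", [
    Node("biology cells genetics", [Node("how do cells divide"), Node("dna replication in bacteria")]),
    Node("economics markets", [Node("inflation and interest rates"), Node("supply and demand")]),
])
hits = beam_traverse(root, "cells divide", beam_size=1)
print(hits[0].text)  # how do cells divide
```

With a beam of width 1, the traversal scores two summaries and then two documents rather than all four documents. On a deep tree this is where the logarithmic cost comes from.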

Read more in the blog or the paper, or try it out in the Colab notebook.

🚀 Usage

Setup

  1. Clone the repository:

    git clone https://github.com/nilesh2797/lattice
    cd lattice
    mkdir results trees

  2. Install dependencies:

    pip install -r src/requirements.txt

  3. Download pre-built semantic trees:

    git clone https://huggingface.co/datasets/quicktensor/lattice-bright-trees ./trees/BRIGHT

  4. Set up API credentials:

    export GOOGLE_API_KEY=your_api_key_here

Quick Start

Run a single experiment:

cd src; python run.py --subset biology --tree_version bottom-up --num_iters 20

Batch Experiments

cd src; bash run.sh
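The contents of run.sh are not shown here, so the following is only a sketch of how a sweep over several subsets and tree versions could be scripted, using nothing beyond the flags documented for run.py. `build_commands` is a hypothetical helper, and the subset list is just the two subsets named in this README.

```python
import itertools
import shlex


def build_commands(subsets, tree_versions, num_iters=20):
    """Return one `python run.py ...` argument list per (subset, tree_version) pair."""
    return [
        ["python", "run.py", "--subset", s, "--tree_version", t, "--num_iters", str(num_iters)]
        for s, t in itertools.product(subsets, tree_versions)
    ]


if __name__ == "__main__":
    # Print the commands for review. To execute them instead, replace print with
    # subprocess.run(cmd, cwd="src", check=True).
    for cmd in build_commands(["biology", "economics"], ["bottom-up", "top-down"]):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to check a sweep before spending API credits on it.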

Configuration

| Parameter | Description | Default |
|---|---|---|
| --subset | Dataset subset (biology, economics, etc.) | Required |
| --tree_version | Tree construction method (bottom-up/top-down) | Required |
| --num_iters | Number of retrieval iterations | 20 |
| --max_beam_size | Beam size during traversal | 2 |
| --relevance_chain_factor | Weight for current score in path relevance | 0.5 |
| --reasoning_in_traversal_prompt | Enable reasoning (thinking budget) | -1 (enabled) |
| --rerank | Additional reranking on final results | False |
| --load_existing | Resume from checkpoint defined by hyperparams | False |
| --suffix | Experiment name suffix | - |

For a complete list, see src/hyperparams.py.
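The description of --relevance_chain_factor ("weight for current score in path relevance") suggests that a node's path relevance blends its own score with the relevance accumulated along the path above it. The sketch below assumes exactly that weighting. The formula actually used in the source may differ, and `path_relevance` is a hypothetical name.

```python
def path_relevance(scores: list[float], chain_factor: float = 0.5) -> float:
    """Blend each node's score with the running relevance of the path above it.

    `scores` lists local relevance scores from the top of the path down to the
    current node. ASSUMPTION: chain_factor weights the current score and
    (1 - chain_factor) weights the parent's path relevance.
    """
    if not scores:
        raise ValueError("scores must contain at least one value")
    relevance = scores[0]
    for score in scores[1:]:
        relevance = chain_factor * score + (1.0 - chain_factor) * relevance
    return relevance


print(path_relevance([1.0, 0.5]))       # 0.75
print(path_relevance([1.0, 0.5], 1.0))  # 0.5
```

Under this assumption, a factor of 1.0 ignores the path history entirely, while smaller values let a strongly relevant ancestor keep a weakly scored child in contention.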

Project Structure

lattice/release/
├── src/
│   ├── run.py              # Main execution script
│   ├── run.sh              # Batch execution wrapper
│   ├── run.ipynb           # Jupyter notebook for running / debugging experiments
│   ├── hyperparams.py      # Hyperparameter definitions
│   ├── tree_objects.py     # Semantic tree and sample objects
│   ├── llm_apis.py         # LLM API wrappers
│   ├── prompts.py          # Prompt templates
│   ├── utils.py            # Utility functions
│   └── calib_utils.py      # Calibration utilities
├── trees/
│   └── BRIGHT/             # Pre-built semantic trees
├── results/
│   └── BRIGHT/             # Experiment results
└── logs/                   # Execution logs

📈 Results

Ranking results on BRIGHT

Retrieval results & cost analysis on Stackexchange datasets from BRIGHT

📜 Citation

If you find this work helpful, please cite:

@article{gupta2025lattice,
  title={LLM-Guided Hierarchical Retrieval},
  author={Gupta, Nilesh and Chang, Wei-Cheng and Bui, Ngot and Hsieh, Cho-Jui and Dhillon, Inderjit S.},
  journal={arXiv preprint arXiv:2510.13217},
  year={2025}
}
