# rust_deepseek_test

A simple Rust-based harness for loading and running text-generation models using the rust-bert and tch (PyTorch) libraries.
## Prerequisites

- Rust toolchain (1.60+)
- libtorch installed or available on your system (see https://github.com/LaurentMazare/tch-rs)
## Building

```sh
# Clone this repository
git clone <repository-url>
cd rust_deepseek_test

# Build in release mode
cargo build --release
```

## Usage

Run the predefined sample prompts and log responses plus inference timings:

```sh
cargo run --release
```

## Configuration

- Model: Edit `src/model_loader.rs` to change `model_name` or adjust `TextGenerationConfig` (e.g., `max_length`, `temperature`, etc.).
- Prompts: Modify the `test_inputs` vector in `src/main.rs` to add or change test questions.
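The usual way to tweak generation parameters in Rust is struct-update syntax over a `Default` implementation. The exact field set of rust-bert's `TextGenerationConfig` varies between versions, so the sketch below uses a hypothetical `GenConfig` stand-in with a few representative fields to illustrate the pattern you would apply in `src/model_loader.rs`:

```rust
// Hypothetical stand-in for rust-bert's TextGenerationConfig; the real
// struct has many more fields, but the override pattern is identical.
#[derive(Debug, Clone)]
struct GenConfig {
    max_length: i64,
    temperature: f64,
    num_beams: i64,
    do_sample: bool,
}

impl Default for GenConfig {
    fn default() -> Self {
        GenConfig {
            max_length: 56,
            temperature: 1.0,
            num_beams: 1,
            do_sample: true,
        }
    }
}

/// Build a config, overriding only the fields we care about;
/// everything else keeps its Default value.
fn make_config() -> GenConfig {
    GenConfig {
        max_length: 128,
        temperature: 0.7,
        ..Default::default()
    }
}

fn main() {
    let config = make_config();
    println!("{:?}", config);
}
```

The same `..Default::default()` idiom applies to the real rust-bert config struct, so version-to-version field additions don't break your loader code.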
## Project Structure

```text
├── Cargo.toml            # Project & dependency definitions
├── codex.md              # Detailed codex/context for the codebase
├── README.md             # Overview and getting started
├── src
│   ├── main.rs           # Entry point and benchmarking harness
│   ├── model_loader.rs   # Model initialization logic
│   ├── text_generator.rs # Text generation wrapper
│   ├── disk_offloader.rs # Utilities to read binary shards from disk
│   └── utils.rs          # Logging helpers
└── target                # Build artifacts (generated)
```
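The benchmarking side of `main.rs` boils down to wrapping each model call in a timer. A minimal std-only sketch of that pattern follows; the `generate` closure here is a dummy stand-in for the real rust-bert model call, not the project's actual code:

```rust
use std::time::{Duration, Instant};

/// Time `generate` on each prompt; returns (output, latency) pairs.
/// `generate` stands in for the real model.generate(...) call.
fn benchmark<F: Fn(&str) -> String>(prompts: &[&str], generate: F) -> Vec<(String, Duration)> {
    prompts
        .iter()
        .map(|&p| {
            let start = Instant::now();
            let out = generate(p);
            (out, start.elapsed())
        })
        .collect()
}

fn main() {
    let test_inputs = ["What is Rust?", "Explain borrowing."];
    // Dummy generator: echoes the prompt back as a fake response.
    let results = benchmark(&test_inputs, |p| format!("(response to '{}')", p));
    for (out, dt) in &results {
        println!("{:?}\t{}", dt, out);
    }
}
```

Returning the durations instead of printing them inline keeps the timing logic testable and lets you aggregate statistics (mean, p95) across prompts later.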
## Extending

- Integrate `disk_offloader` to manually stream model weights from custom locations.
- Add CLI options (e.g., prompt input, model parameters) in `main.rs`.
- Enhance error handling or retry logic using `anyhow` and the helpers in `src/utils.rs`.
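A chunked shard reader along the lines of `disk_offloader.rs` can be sketched with only the standard library. The shard file name and chunk size below are illustrative assumptions, not the project's actual layout:

```rust
use std::fs::File;
use std::io::{self, BufReader, Read, Write};

/// Stream a binary shard from disk in fixed-size chunks instead of
/// loading it into memory at once. Returns the total bytes read.
fn stream_shard(path: &str, chunk_size: usize, mut sink: impl FnMut(&[u8])) -> io::Result<u64> {
    let mut reader = BufReader::new(File::open(path)?);
    let mut buf = vec![0u8; chunk_size];
    let mut total = 0u64;
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break; // end of file
        }
        sink(&buf[..n]);
        total += n as u64;
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    // Write a small dummy shard so the example is self-contained.
    let path = "shard-00001.bin"; // illustrative name, not the real layout
    File::create(path)?.write_all(&[7u8; 10_000])?;

    let mut chunks = 0;
    let total = stream_shard(path, 4096, |_chunk| chunks += 1)?;
    println!("read {} bytes in {} chunks", total, chunks); // 10000 bytes, 3 chunks

    std::fs::remove_file(path)?;
    Ok(())
}
```

Passing a sink closure rather than returning a `Vec<u8>` keeps peak memory bounded by `chunk_size`, which is the point of offloading weights to disk in the first place.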
For more in-depth context on module responsibilities and design, see `codex.md`.