
Replit Code 1.3B Truss

This repository packages Replit Code 1.3B as a Truss.

Replit Code 1.3B is an LLM released by Replit, optimized and trained for generating code autocompletions.

Deploying Replit Code 1.3B

First, clone this repository:

git clone https://github.com/basetenlabs/truss-examples/
cd truss-examples/replit-code-1.3b-truss

Before deployment:

  1. Make sure you have a Baseten account and API key.
  2. Install the latest version of Truss: pip install --upgrade truss

With replit-code-1.3b-truss as your working directory, you can deploy the model with:

truss push

Paste your Baseten API key if prompted.

For more information, see the Truss documentation.

Hardware notes

We found this model runs reasonably fast on A10Gs; you can configure the hardware you'd like in the config.yaml.

...
resources:
  cpu: "3"
  memory: 14Gi
  use_gpu: true
  accelerator: A10G
...

Alternatively, you can deploy from a Python script using the Baseten client. Before deploying this way:

  1. Make sure you have a Baseten account and API key. You can sign up for a Baseten account here.
  2. Install Truss and the Baseten Python client: pip install --upgrade baseten truss
  3. Authenticate your development environment with baseten login

Then load the Truss and push it from a Python script:

import baseten
import truss

replit_code_truss = truss.load('.')
baseten.deploy(replit_code_truss)

Invoking Replit Code 1.3B

The usual GPT-style parameters will pass right through to the inference endpoint:

  • max_new_tokens (default: 64)
  • temperature (default: 0.5)
  • top_p (default: 0.9)
  • top_k (default: 0)
  • num_beams (default: 4)
  • do_sample (default: False)

For best results, we recommend setting do_sample to true and increasing max_new_tokens to 200-300.

truss predict -d '{"prompt": "def fib(n):", "do_sample": true, "max_new_tokens": 300}'
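
Server-side, any parameters you pass are presumably merged with the defaults listed above. A minimal sketch of that merge (the actual model.py implementation may differ):

```python
# Default generation parameters, as listed above
DEFAULTS = {
    "max_new_tokens": 64,
    "temperature": 0.5,
    "top_p": 0.9,
    "top_k": 0,
    "num_beams": 4,
    "do_sample": False,
}

def resolve_params(request: dict) -> dict:
    """Fill in defaults for any generation parameters the caller omitted."""
    overrides = {k: v for k, v in request.items() if k in DEFAULTS}
    return {**DEFAULTS, **overrides}

# Parameters present in the request win; everything else falls back to a default.
params = resolve_params({"prompt": "def fib(n):", "do_sample": True, "max_new_tokens": 300})
```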

You can also invoke your model via its REST API:

curl -X POST "https://app.baseten.co/models/YOUR_MODEL_ID/predict" \
     -H "Content-Type: application/json" \
     -H 'Authorization: Api-Key {YOUR_API_KEY}' \
     -d '{
           "prompt": "def fib(n):",
           "do_sample": true,
           "max_new_tokens": 300
         }'
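
The same call can be assembled from Python. This sketch only builds the URL, headers, and JSON body; YOUR_MODEL_ID and YOUR_API_KEY are placeholders you must fill in, and you can send the request with any HTTP client:

```python
import json

# Placeholder model ID -- substitute your deployed model's ID
API_URL = "https://app.baseten.co/models/YOUR_MODEL_ID/predict"

def build_invocation(prompt: str, api_key: str, **params):
    """Assemble the URL, headers, and JSON body for a predict call."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Api-Key {api_key}",
    }
    body = json.dumps({"prompt": prompt, **params})
    return API_URL, headers, body

url, headers, body = build_invocation(
    "def fib(n):", "YOUR_API_KEY", do_sample=True, max_new_tokens=300
)
# Send with any HTTP client, e.g.:
#   requests.post(url, headers=headers, data=body)
```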