GPT-J Truss

This is an implementation of EleutherAI's GPT-J-6B. The model consists of 28 layers with a model dimension of 4096 and a feedforward dimension of 16384. The model dimension is split into 16 heads, each with a dimension of 256. Rotary Position Embedding (RoPE) is applied to 64 dimensions of each head. The model was trained with a tokenization vocabulary of 50257 tokens, using the same set of BPEs as GPT-2/GPT-3.
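The figures above are related by simple arithmetic, which the short sketch below checks. The weight count is a rough estimate only: it assumes four square attention projections and two feedforward matrices per layer, and it ignores biases, layer norms, and the embedding tables.

```python
# Hyperparameters quoted in the description above.
N_LAYERS = 28
D_MODEL = 4096
D_FF = 16384
N_HEADS = 16
ROTARY_DIM = 64

# Each head receives an equal slice of the model dimension.
head_dim = D_MODEL // N_HEADS

# Fraction of each head's dimensions that RoPE is applied to.
rotary_fraction = ROTARY_DIM / head_dim

# Ratio of the feedforward width to the model width.
ffn_ratio = D_FF // D_MODEL

# Rough weight count per layer (assumption: Q, K, V, and output projections
# are D_MODEL x D_MODEL, and the feedforward block has two D_MODEL x D_FF
# matrices). Biases, layer norms, and embeddings are deliberately left out.
attn_weights = 4 * D_MODEL * D_MODEL
ffn_weights = 2 * D_MODEL * D_FF
block_weights = N_LAYERS * (attn_weights + ffn_weights)

print(head_dim)         # 256
print(rotary_fraction)  # 0.25
print(ffn_ratio)        # 4
print(block_weights)    # 5637144576
```

Under these assumptions, the transformer blocks alone hold about 5.6 billion weights, which is consistent with the "6B" in the model name.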

Deploying GPT-J

First, clone this repository:

git clone https://github.com/basetenlabs/truss-examples/
cd truss-examples/{GPT_J_TRUSS_DIRECTORY}

Before deployment:

  1. Make sure you have a Baseten account and API key.
  2. Install the latest version of Truss: pip install --upgrade truss

With the GPT-J Truss directory (the {GPT_J_TRUSS_DIRECTORY} placeholder above) as your working directory, you can deploy the model with:

truss push

Paste your Baseten API key if prompted.

For more information, see the Truss documentation.

GPT-J API documentation

Input

The input should be a dictionary and must contain the following key:

  • prompt - the prompt for text generation

Additionally, the following optional parameters are passed through to the generate method. For more details, see the official documentation.

  • max_length - int - limited to 512
  • min_length - int - limited to 64
  • do_sample - bool
  • early_stopping - bool
  • num_beams - int
  • temperature - float
  • top_k - int
  • top_p - float
  • repetition_penalty - float
  • length_penalty - float
  • encoder_no_repeat_ngram_size - int
  • num_return_sequences - int
  • max_time - float
  • num_beam_groups - int
  • diversity_penalty - float
  • remove_invalid_values - bool

Here's an example input:

{
    "prompt": "If I was a billionaire, I would",
    "max_length": 50
}
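A client may want to check a request before sending it. The sketch below is one possible way to do that and is not part of this Truss: `build_input` and the constants are hypothetical names, the type table simply mirrors the list above, and reading "limited to" as an upper bound is an assumption.

```python
# Assumption: "limited to" in the parameter list means an upper bound.
MAX_LENGTH_LIMIT = 512
MIN_LENGTH_LIMIT = 64

# Optional parameters and their documented types, copied from the list above.
OPTIONAL_PARAMS = {
    "max_length": int, "min_length": int, "do_sample": bool,
    "early_stopping": bool, "num_beams": int, "temperature": float,
    "top_k": int, "top_p": float, "repetition_penalty": float,
    "length_penalty": float, "encoder_no_repeat_ngram_size": int,
    "num_return_sequences": int, "max_time": float, "num_beam_groups": int,
    "diversity_penalty": float, "remove_invalid_values": bool,
}


def _matches(value, expected):
    """Check a value against a documented type, keeping bool separate from int."""
    if expected is bool:
        return isinstance(value, bool)
    if isinstance(value, bool):
        return False
    if expected is float:
        return isinstance(value, (int, float))
    return isinstance(value, int)


def build_input(prompt, **params):
    """Build a request dictionary (hypothetical client-side helper)."""
    if not isinstance(prompt, str) or not prompt:
        raise ValueError("prompt must be a non-empty string")
    payload = {"prompt": prompt}
    for name, value in params.items():
        if name not in OPTIONAL_PARAMS:
            raise ValueError(f"unsupported parameter: {name}")
        if not _matches(value, OPTIONAL_PARAMS[name]):
            raise TypeError(f"{name} must be {OPTIONAL_PARAMS[name].__name__}")
        payload[name] = value
    if payload.get("max_length", 0) > MAX_LENGTH_LIMIT:
        raise ValueError(f"max_length is limited to {MAX_LENGTH_LIMIT}")
    if payload.get("min_length", 0) > MIN_LENGTH_LIMIT:
        raise ValueError(f"min_length is limited to {MIN_LENGTH_LIMIT}")
    return payload


example = build_input("If I was a billionaire, I would", max_length=50)
```

Here `example` holds the same dictionary as the example input, while a call such as `build_input("Hello", max_length=1024)` raises a `ValueError` before any request is made.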

Output

The result will be a dictionary containing:

  • status - either success or failed
  • data - the output text
  • message - will contain details in the case of errors

Here's an example output:

{"status": "success", "data": "If I was a billionaire, I would buy a plane.", "message": null}
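A response of this shape could be unpacked on the client side roughly as follows. This is a minimal sketch: `extract_text` and `PredictionError` are hypothetical names, and only the three documented keys are assumed to be present.

```python
import json


class PredictionError(Exception):
    """Raised when the response reports a status other than success."""


def extract_text(response_body):
    """Return the generated text, or raise with the reported message."""
    result = json.loads(response_body)
    if result.get("status") != "success":
        # On failure, the message key carries the error details.
        raise PredictionError(result.get("message") or "prediction failed")
    return result["data"]


body = (
    '{"status": "success", '
    '"data": "If I was a billionaire, I would buy a plane.", '
    '"message": null}'
)
text = extract_text(body)
print(text)  # If I was a billionaire, I would buy a plane.
```

Raising an exception on a non-success status keeps error handling out of the happy path, but returning the whole dictionary to the caller would work equally well.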

Example usage

truss predict -d '{"prompt": "If I was a billionaire, I would"}'

You can also invoke this model on Baseten with the following cURL command (just fill in the model version ID and API Key):

curl -X POST https://app.baseten.co/models/{MODEL_VERSION_ID}/predict \
  -H 'Authorization: Api-Key {YOUR_API_KEY}' \
  -d '{
    "prompt": "If I was a billionaire, I would",
    "max_length": 50
}'
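The same request can be prepared from Python using only the standard library. The sketch below mirrors the URL and Authorization header from the cURL command; `build_predict_request` is a hypothetical helper, and the explicit JSON Content-Type header is an assumption that the cURL command does not set.

```python
import json
import urllib.request

# URL pattern taken from the cURL command above.
PREDICT_URL = "https://app.baseten.co/models/{model_version_id}/predict"


def build_predict_request(model_version_id, api_key, payload):
    """Prepare (but do not send) a POST request for the predict endpoint."""
    return urllib.request.Request(
        PREDICT_URL.format(model_version_id=model_version_id),
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Api-Key {api_key}",
            # Assumption: the endpoint accepts an explicit JSON content type.
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; substitute your own model version ID and API key.
request = build_predict_request(
    "MODEL_VERSION_ID",
    "YOUR_API_KEY",
    {"prompt": "If I was a billionaire, I would", "max_length": 50},
)
print(request.full_url)
```

To send the request, pass it to `urllib.request.urlopen(request)` and read the response body, which should follow the output format described above.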