iguanesolutions/kimi-rp

kimi-rp

Kimi Reverse Proxy is a lightweight HTTP reverse proxy that automatically adjusts sampling parameters (temperature, top_p) based on whether a thinking or non-thinking model is being used. It sits between your application and the backend LLM server (e.g., vLLM).

Installation

Requirements: Go 1.24.2 or later

go build -o kimi-rp .

Configuration

Configure the proxy using command-line flags or environment variables:

| Flag | Environment Variable | Default | Description |
| --- | --- | --- | --- |
| -listen | KIMIRP_LISTEN | 0.0.0.0 | IP address to listen on |
| -port | KIMIRP_PORT | 9000 | Port to listen on |
| -target | KIMIRP_TARGET | http://127.0.0.1:8000 | Backend target URL |
| -loglevel | KIMIRP_LOGLEVEL | INFO | Log level (DEBUG, INFO, WARN, ERROR) |
| -thinking-model | KIMIRP_THINKING_MODEL_NAME | (required) | Name of the thinking model |
| -no-thinking-model | KIMIRP_NO_THINKING_MODEL_NAME | (required) | Name of the non-thinking model |

How It Works

  1. Client sends a request with a model name in the request body
  2. Proxy inspects the model field to determine if it's a thinking or non-thinking model
  3. Proxy sets appropriate sampling parameters:
    • If thinking model: temperature=1.0, top_p=0.95, extra_body.thinking=true
    • If non-thinking model: temperature=0.6, top_p=0.95, extra_body.thinking=false
  4. Request is forwarded to the backend server
  5. Response is streamed back to the client

License

MIT License - see LICENSE file for details.
