TheTokenCompany/tokenc-python-sdk

tokenc - Python SDK

Compress LLM prompts to reduce costs and latency. 100K tokens compressed in ~85ms.

Install

pip install tokenc

Usage

from tokenc import TokenClient

client = TokenClient(api_key="your-api-key")

result = client.compress_input(
    input="Your long prompt here...",
    model="bear-1.2",  # or "bear-1.1", "bear-1", etc.
    aggressiveness=0.5  # 0.1 = light, 0.5 = balanced, 0.9 = aggressive
)

print(result.output)           # compressed text
print(result.tokens_saved)     # tokens removed
print(result.compression_ratio) # e.g. 1.8x
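The fields on the result make it easy to estimate what a compression saves you in dollars. A minimal sketch, assuming a hypothetical input price of $0.003 per 1K tokens (an example rate, not a tokenc or model price; substitute your provider's actual pricing):

```python
# Estimate dollar savings from a compression result (illustrative only).
# PRICE_PER_1K_TOKENS is an assumed example rate, not a real price.
PRICE_PER_1K_TOKENS = 0.003

def estimated_savings(tokens_saved: int, price_per_1k: float = PRICE_PER_1K_TOKENS) -> float:
    """Dollar amount saved by removing `tokens_saved` input tokens."""
    return tokens_saved / 1000 * price_per_1k

# e.g. a result that removed 40,000 tokens:
print(f"${estimated_savings(40_000):.2f} saved per request")
```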

Protected Content

Wrap text in <ttc_safe> tags to exclude it from compression:

result = client.compress_input(
    input="Compress this but <ttc_safe>keep this exactly as is</ttc_safe>.",
    model="bear-1.2",
    aggressiveness=0.7
)
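When the protected spans are built dynamically, a tiny helper keeps the tag usage consistent. The `protect` function below is a hypothetical convenience for illustration, not part of the SDK:

```python
def protect(text: str) -> str:
    """Wrap text in <ttc_safe> tags so tokenc leaves it verbatim."""
    return f"<ttc_safe>{text}</ttc_safe>"

# Build a prompt where only the quoted clause is protected:
prompt = f"Summarize the contract. Keep this clause exact: {protect('Section 4.2(b)')}"
```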

Context Manager

with TokenClient(api_key="your-api-key") as client:
    result = client.compress_input(input="Your text...", model="bear-1.2", aggressiveness=0.5)

Performance

Requests are gzip-compressed and use HTTP keep-alive automatically.

| Input Size  | E2E Latency | Throughput   |
|-------------|-------------|--------------|
| 10K tokens  | 38ms        | 198K tok/s   |
| 100K tokens | 85ms        | 975K tok/s   |
| 1M tokens   | 542ms       | 1.5M tok/s   |
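For sizing your own workloads, the reported ratio and token savings are related by simple arithmetic. A sketch, assuming `compression_ratio` is defined as input tokens divided by output tokens (consistent with the "1.8x" example above):

```python
def tokens_saved_from_ratio(input_tokens: int, ratio: float) -> int:
    # A 1.8x ratio means the output is input/1.8 tokens long,
    # so the savings are input - input/ratio.
    return round(input_tokens - input_tokens / ratio)

print(tokens_saved_from_ratio(100_000, 1.8))  # roughly 44% of the input removed
```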

Error Handling

from tokenc import TokenClient, AuthenticationError, RateLimitError, APIError

client = TokenClient(api_key="your-api-key")

try:
    result = client.compress_input(input="Your text...", model="bear-1.2")
except AuthenticationError:
    print("Invalid API key")
except RateLimitError:
    print("Rate limit exceeded")
except APIError as e:
    print(f"API error: {e}")
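Rate limits are usually transient, so a common pattern is to retry the call with exponential backoff. A self-contained sketch; the `RateLimitError` defined below is a local stand-in so the example runs on its own (in real code, import it from tokenc and wrap `client.compress_input`):

```python
import time

class RateLimitError(Exception):
    """Stand-in for tokenc's RateLimitError, so this sketch is self-contained."""

def with_retries(fn, max_attempts=4, base_delay=0.5):
    """Call fn(), retrying with exponential backoff on RateLimitError."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

# Example: a flaky call that succeeds on the third attempt.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError()
    return "ok"

print(with_retries(flaky, base_delay=0.01))  # prints "ok"
```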
