UncensorBench is a benchmark for comparing how different AI models censor their responses and what their political leanings are. It currently has 20 prompts that test a model's bias and 49 that test its censorship. So far, only a few models have been run on the benchmark more than once.
Pull requests and issues are welcome if you want to add more prompts or improve the benchmark.
Link to the benchmark: https://uncensor.btx.sh
bun install
cd /apps/bench
Edit .env to set your OpenAI API key and OpenRouter API key.
Edit src/models.ts to set the models you want to benchmark.
Edit src/index.ts to set the concurrency for the benchmark (the default is 10 prompts and 2 models at a time).
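As a rough illustration of what a two-level concurrency limit like this can look like, here is a minimal sketch. It is not the code in src/index.ts: the constant names, `mapWithLimit`, `askModel`, and `runBenchmark` are all hypothetical, and `askModel` is a stand-in for a real API call.

```typescript
// Hypothetical names mirroring the defaults mentioned above;
// the real settings live in src/index.ts and may look different.
const PROMPT_CONCURRENCY = 10;
const MODEL_CONCURRENCY = 2;

// Run `worker` over `items` with at most `limit` calls in flight.
// Results are returned in the same order as the input.
async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each lane pulls the next unclaimed index until none are left.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const laneCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}

// Stand-in for a real model request (assumption: one call per prompt).
async function askModel(model: string, prompt: string): Promise<string> {
  await new Promise((resolve) => setTimeout(resolve, 5));
  return `${model}: ${prompt}`;
}

// Outer limit bounds models in parallel, inner limit bounds prompts per model.
async function runBenchmark(
  models: string[],
  prompts: string[],
): Promise<string[][]> {
  return mapWithLimit(models, MODEL_CONCURRENCY, (model) =>
    mapWithLimit(prompts, PROMPT_CONCURRENCY, (prompt) => askModel(model, prompt)),
  );
}
```

With these values, at most 2 × 10 = 20 requests would be in flight at once, so lowering either number is one way to stay under provider rate limits.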
Note: by default, the benchmark runs on every model currently listed in src/models.ts, which can be expensive and take a while to complete. It is recommended to benchmark a subset of models at a time by commenting out the ones you don't want to run in src/models.ts.
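For illustration, commenting out a model might look like the sketch below. The `ModelEntry` shape and the model identifiers are placeholders, not the actual contents of src/models.ts. The `requestCount` helper is also hypothetical and assumes one request per prompt per model, which gives a quick sense of how the cost scales.

```typescript
// Hypothetical entry shape; the real one in src/models.ts may differ.
interface ModelEntry {
  provider: "openai" | "openrouter";
  id: string; // placeholder identifiers, not real model names
}

const models: ModelEntry[] = [
  { provider: "openai", id: "example-model-a" },
  // { provider: "openrouter", id: "example-vendor/example-model-b" }, // skipped for this run
  { provider: "openrouter", id: "example-vendor/example-model-c" },
];

// 20 bias prompts + 49 censorship prompts, as stated in this README.
const PROMPT_COUNT = 20 + 49;

// Assumption: each model receives each prompt exactly once.
function requestCount(modelCount: number, promptCount: number = PROMPT_COUNT): number {
  return modelCount * promptCount;
}

console.log(`${models.length} models, ${requestCount(models.length)} requests`);
```

Under that assumption, every model left uncommented adds another 69 requests to a run.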
bun run bench
bun install
cd /apps/web
Edit .env to set your Database URL and Auth Token for the add model endpoint.
bun run dev
bun run cf:preview
bun run cf:deploy