GitHub - Ybakman/TruthTorchLM

TruthTorchLM: A Comprehensive Package for Assessing/Predicting Truthfulness in LLM Outputs (EMNLP - 2025)

Features

State-of-the-Art Methods: Offers more than 30 truth methods that are designed to assess/predict the truthfulness of LLM generations. These methods range from Google search check to uncertainty estimation and multi-LLM collaboration techniques.
Integration: Fully compatible with Huggingface and LiteLLM, enabling users to integrate truthfulness assessment/prediction into their workflows with minimal code changes.
Evaluation Tools: Benchmark truth methods using various metrics including AUROC, AUPRC, PRR, and Accuracy.
Calibration: Normalize and calibrate truth methods for interpretable and comparable outputs.
Long-Form Generation: Adapts truth methods to assess/predict truthfulness in long-form text generations effectively.
Extendability: Provides an intuitive interface for implementing new truth methods.

Installation

Create a new environment with Python >= 3.10:

conda create --name truthtorchlm python=3.10
conda activate truthtorchlm

Then, install TruthTorchLM using pip:

pip install TruthTorchLM

Alternatively, install from source:

git clone https://github.com/Ybakman/TruthTorchLM.git
cd TruthTorchLM
pip install -r requirements.txt

Demo Video Available on YouTube

https://youtu.be/Bim-6Tv_qU4

Quick Start

Setting Up Credentials

import os
os.environ["OPENAI_API_KEY"] = "your_open_ai_key"  # to use OpenAI models
os.environ["SERPER_API_KEY"] = "your_serper_api_key"  # for long-form generation evaluation: https://serper.dev/
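
Hardcoding keys in a script is easy to get wrong, so it can help to fail early when one is missing. The following is a minimal sketch, not part of TruthTorchLM: the helper name `missing_keys` is hypothetical, and it only checks the two variables named above.

```python
import os

# The two variables used in the snippet above; which ones you actually
# need depends on the features you use.
REQUIRED_KEYS = ("OPENAI_API_KEY", "SERPER_API_KEY")


def missing_keys(required=REQUIRED_KEYS, env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing credentials:", ", ".join(missing))
```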

Setting Up a Model

You can define your model and tokenizer using Huggingface or specify an API-based model:

from transformers import AutoModelForCausalLM, AutoTokenizer
import TruthTorchLM as ttlm
import torch

# Huggingface model
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct", 
    torch_dtype=torch.bfloat16
).to('cuda:0')
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct", use_fast=False)
# API model
api_model = "gpt-4o"

Generating Text with Truth Values

TruthTorchLM generates a message together with truth values that estimate whether the model output is truthful. The estimators are called truth methods. Each method uses its own algorithm and has its own output range, so raw values from different methods are not directly comparable. For a given method, a higher truth value generally suggests a more truthful output. This functionality is most useful for short-form QA:

# Define truth methods
lars = ttlm.truth_methods.LARS()
confidence = ttlm.truth_methods.Confidence()
self_detection = ttlm.truth_methods.SelfDetection(number_of_questions=5)
truth_methods = [lars, confidence, self_detection]

# Define a chat history
chat = [{"role": "system", "content": "You are a helpful assistant. Give short and precise answers."},
        {"role": "user", "content": "What is the capital city of France?"}]

# Generate text with truth values (Huggingface model)
output_hf_model = ttlm.generate_with_truth_value(
    model=model,
    tokenizer=tokenizer,
    messages=chat,
    truth_methods=truth_methods,
    max_new_tokens=100,
    temperature=0.7
)
# Generate text with truth values (API model)
output_api_model = ttlm.generate_with_truth_value(
    model=api_model,
    messages=chat,
    truth_methods=truth_methods
)
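
To make "truth value" and "output range" concrete, here is a small sketch of one common confidence-style score: the length-normalized log-probability of the generated tokens. It assumes per-token log-probabilities are already available, and the function name `mean_log_prob` is illustrative; it is not presented as the library's `Confidence` implementation.

```python
import math


def mean_log_prob(token_log_probs):
    """Length-normalized log-probability of a generated sequence.

    The result is always <= 0; values closer to 0 mean the model
    assigned higher probability to its own tokens.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return sum(token_log_probs) / len(token_log_probs)


# Toy numbers, not real model outputs.
confident = [math.log(0.9), math.log(0.95), math.log(0.85)]
hesitant = [math.log(0.4), math.log(0.3), math.log(0.5)]
print(mean_log_prob(confident) > mean_log_prob(hesitant))
```

A score like this lives in (-inf, 0], while a method that asks the model for a probability lives in [0, 1], which is why raw values from different methods cannot be compared without normalization.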

Calibrating Truth Methods

Truth values from different methods may not be directly comparable. Use the calibrate_truth_method function to normalize truth values to a common range for better interpretability. Note that the normalized truth value in the output dictionary is meaningless unless the method has been calibrated.

model_judge = ttlm.evaluators.ModelJudge('gpt-4o-mini')

for truth_method in truth_methods:
    truth_method.set_normalizer(ttlm.normalizers.IsotonicRegression())

calibration_results = ttlm.calibrate_truth_method(
    dataset='trivia_qa',
    model=model,
    truth_methods=truth_methods,
    tokenizer=tokenizer,
    correctness_evaluator=model_judge,
    size_of_data=1000,
    max_new_tokens=64
)
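
For intuition about what an isotonic normalizer does, the sketch below fits a non-decreasing step function from raw scores to the observed fraction of correct answers, using the pool-adjacent-violators algorithm in plain Python. The names `fit_isotonic` and `normalize` are hypothetical, and the library's `IsotonicRegression` normalizer may differ in details such as interpolation between points.

```python
from bisect import bisect_left
from itertools import groupby


def fit_isotonic(scores, labels):
    """Fit a non-decreasing map from raw score to P(correct).

    Returns (thresholds, values): scores up to thresholds[i] (and above
    thresholds[i-1]) are mapped to values[i].
    """
    points = sorted(zip(scores, labels))
    blocks = []  # each block: [sum_of_labels, count, largest_score]
    for score, group in groupby(points, key=lambda p: p[0]):
        group_labels = [label for _, label in group]
        blocks.append([sum(group_labels), len(group_labels), score])
        # Pool adjacent blocks while the earlier mean exceeds the later one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    thresholds = [b[2] for b in blocks]
    values = [b[0] / b[1] for b in blocks]
    return thresholds, values


def normalize(score, thresholds, values):
    """Map a raw score to its calibrated value, clamping at both ends."""
    idx = bisect_left(thresholds, score)
    return values[min(idx, len(values) - 1)]


# Toy calibration data: raw scores and whether the answer was correct.
thresholds, values = fit_isotonic([0.1, 0.2, 0.3, 0.4], [0, 1, 0, 1])
print(normalize(0.25, thresholds, values))
```

After such a fit, every method's output can be read on the same scale, as an estimated probability of correctness.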

Evaluating Truth Methods

We can evaluate the truth methods with the evaluate_truth_method function. We can define different evaluation metrics including AUROC, AUPRC, AUARC, Accuracy, F1, Precision, Recall, PRR:

results = ttlm.evaluate_truth_method(
    dataset='trivia_qa',
    model=model,
    truth_methods=truth_methods,
    eval_metrics=['auroc', 'prr'],
    tokenizer=tokenizer,
    size_of_data=1000,
    correctness_evaluator=model_judge,
    max_new_tokens=64
)
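
As a reminder of what the AUROC metric measures here, the sketch below computes it directly from its definition: the probability that a randomly chosen correct answer receives a higher truth value than a randomly chosen incorrect one, with ties counted as half. The function `auroc` is a standalone illustration and not the library's implementation.

```python
def auroc(truth_values, correct):
    """Pairwise AUROC: P(score of a correct answer > score of an incorrect one)."""
    pos = [s for s, c in zip(truth_values, correct) if c]
    neg = [s for s, c in zip(truth_values, correct) if not c]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data: truth values and correctness labels for four answers.
print(auroc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))
```

Because AUROC depends only on the ranking of truth values, it can compare methods with different output ranges without calibration.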

Truthfulness in Long-Form Generation

Assigning a single truth value for a long text is neither practical nor useful. TruthTorchLM first decomposes the generated text into short, single-sentence claims and assigns truth values to these claims using claim check methods. The long_form_generation_with_truth_value function returns the generated text, decomposed claims, and their truth values.

import TruthTorchLM.long_form_generation as LFG
from transformers import DebertaForSequenceClassification, DebertaTokenizer

#define a decomposition method that breaks the long text into claims
decomposition_method = LFG.decomposition_methods.StructuredDecompositionAPI(model="gpt-4o-mini", decomposition_depth=1) #Utilize API models to decompose text
# decomposition_method = LFG.decomposition_methods.StructuredDecompositionLocal(model, tokenizer, decomposition_depth=1) #Utilize HF models to decompose text

#entailment model is used by some truth methods and claim check methods
model_for_entailment = DebertaForSequenceClassification.from_pretrained('microsoft/deberta-large-mnli').to('cuda:0')
tokenizer_for_entailment = DebertaTokenizer.from_pretrained('microsoft/deberta-large-mnli')

#define truth methods 
confidence = ttlm.truth_methods.Confidence()
lars = ttlm.truth_methods.LARS()

#define the claim check methods that apply truth methods
qa_generation = LFG.claim_check_methods.QuestionAnswerGeneration(model="gpt-4o-mini", tokenizer=None, num_questions=2, max_answer_trials=2,
                                                                     truth_methods=[confidence, lars], seed=0,
                                                                     entailment_model=model_for_entailment, entailment_tokenizer=tokenizer_for_entailment) #HF model and tokenizer can also be used, LM is used to generate question
#some claim check methods are designed specifically for claims and do not rely on truth methods
ac_entailment = LFG.claim_check_methods.AnswerClaimEntailment( model="gpt-4o-mini", tokenizer=None, 
                                                                      num_questions=3, num_answers_per_question=2, 
                                                                      entailment_model=model_for_entailment, entailment_tokenizer=tokenizer_for_entailment) #HF model and tokenizer can also be used, LM is used to generate question

#define a chat history
chat = [{"role": "system", "content": 'You are a helpful assistant. Give brief and precise answers.'},
        {"role": "user", "content": 'Who is Ryan Reynolds?'}]

#generate a message with truth values; this is a wrapper function for model.generate in Huggingface
output_hf_model = LFG.long_form_generation_with_truth_value(model=model, tokenizer=tokenizer, messages=chat, decomp_method=decomposition_method, 
                                          claim_check_methods=[qa_generation, ac_entailment], generation_seed=0)

#generate a message with truth values; this is a wrapper function for litellm.completion in LiteLLM
output_api_model = LFG.long_form_generation_with_truth_value(model="gpt-4o-mini", messages=chat, decomp_method=decomposition_method, 
                                          claim_check_methods=[qa_generation, ac_entailment], generation_seed=0, seed=0)
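
The decompose-then-check flow described above can be pictured with a toy pipeline. This is only a sketch: it splits on sentence boundaries instead of using an LLM-based decomposition method, the scorer is a placeholder callable, and the names `split_into_claims` and `check_claims` as well as the dictionary keys are hypothetical rather than the library's output format.

```python
import re


def split_into_claims(text):
    """Naive stand-in for a decomposition method: one claim per sentence."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if s]


def check_claims(text, scorer):
    """Decompose the text, then score every claim independently."""
    return [
        {"claim": claim, "truth_value": scorer(claim)}
        for claim in split_into_claims(text)
    ]


# Placeholder scorer; a real claim check method would query a model here.
results = check_claims(
    "Ryan Reynolds is an actor. He was born in Vancouver.",
    scorer=lambda claim: 0.0,
)
for item in results:
    print(item["claim"], item["truth_value"])
```

The point of the structure is that each claim carries its own truth value, so a long answer can be partly supported and partly not.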

Evaluation of Truth Methods in Long-Form Generation

We can evaluate truth methods on long-form generation with the evaluate_truth_method_long_form function. To obtain the correctness of the claims, we follow the SAFE paper: SAFE performs a Google search for each claim and labels the claim as supported, unsupported, or irrelevant. We can define different evaluation metrics including AUROC, AUPRC, AUARC, Accuracy, F1, Precision, Recall, PRR.

#create safe object that assigns labels to the claims
safe = LFG.ClaimEvaluator(rater='gpt-4o-mini', tokenizer = None, max_steps = 5, max_retries = 10, num_searches = 3) 

#Define metrics
sample_level_eval_metrics = ['f1'] #calculate metric over the claims of a question, then average across all the questions
dataset_level_eval_metrics = ['auroc', 'prr'] #calculate the metric across all claims

results = LFG.evaluate_truth_method_long_form(dataset='longfact_objects', model='gpt-4o-mini', tokenizer=None,
                                sample_level_eval_metrics=sample_level_eval_metrics, dataset_level_eval_metrics=dataset_level_eval_metrics,
                                decomp_method=decomposition_method, claim_check_methods=[qa_generation],
                                claim_evaluator = safe, size_of_data=3,  previous_context=[{'role': 'system', 'content': 'You are a helpful assistant. Give precise answers.'}], 
                                user_prompt="Question: {question_context}", seed=41,  return_method_details = False, return_calim_eval_details=False, wandb_run = None,  
                                add_generation_prompt = True, continue_final_message = False)
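
The difference between sample-level and dataset-level metrics, as described in the comments above, can be illustrated in a few lines. This sketch assumes binary predictions and labels per claim, and the helper names `f1`, `sample_level`, and `dataset_level` are hypothetical; it is not the library's evaluation code.

```python
def f1(preds, labels):
    """F1 score for binary predictions; returns 0.0 when there are no true positives."""
    tp = sum(1 for p, y in zip(preds, labels) if p and y)
    fp = sum(1 for p, y in zip(preds, labels) if p and not y)
    fn = sum(1 for p, y in zip(preds, labels) if not p and y)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def sample_level(metric, per_question):
    """Compute the metric per question, then average: every question weighs the same."""
    return sum(metric(p, y) for p, y in per_question) / len(per_question)


def dataset_level(metric, per_question):
    """Pool all claims and compute the metric once: every claim weighs the same."""
    preds = [p for ps, _ in per_question for p in ps]
    labels = [y for _, ys in per_question for y in ys]
    return metric(preds, labels)


# Toy data: (predictions, labels) for the claims of two questions.
questions = [
    ([1, 1, 1, 1], [1, 1, 1, 1]),  # four claims, all predicted correctly
    ([1, 0], [0, 1]),              # two claims, both predicted wrongly
]
print(sample_level(f1, questions), dataset_level(f1, questions))
```

With this data the two aggregations disagree, because the question with more claims dominates the pooled score but not the per-question average.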

Available Truth Methods

LARS: Do Not Design, Learn: A Trainable Scoring Function for Uncertainty Estimation in Generative LLMs.
Confidence: Uncertainty Estimation in Autoregressive Structured Prediction.
Entropy: Uncertainty Estimation in Autoregressive Structured Prediction.
SelfDetection: Knowing What LLMs DO NOT Know: A Simple Yet Effective Self-Detection Method.
AttentionScore: LLM-Check: Investigating Detection of Hallucinations in Large Language Models.
CrossExamination: LM vs LM: Detecting Factual Errors via Cross Examination.
EccentricityConfidence: Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models.
EccentricityUncertainty: Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models.
GoogleSearchCheck: FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios.
Inside: INSIDE: LLMs' Internal States Retain the Power of Hallucination Detection.
KernelLanguageEntropy: Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities.
MARS: MARS: Meaning-Aware Response Scoring for Uncertainty Estimation in Generative LLMs.
MatrixDegreeConfidence: Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models.
MatrixDegreeUncertainty: Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models.
MiniCheck: MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents.
MultiLLMCollab: Don’t Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration.
NumSemanticSetUncertainty: Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation.
PTrue: Language Models (Mostly) Know What They Know.
Saplma: The Internal State of an LLM Knows When It’s Lying.
SemanticEntropy: Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation.
sentSAR: Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models.
SumEigenUncertainty: Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models.
tokenSAR: Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models.
VerbalizedConfidence: Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback.
DirectionalEntailmentGraph: LLM Uncertainty Quantification through Directional Entailment Graph and Claim Level Response Augmentation.

Contributors

Yavuz Faruk Bakman
Duygu Nur Yaldiz
Sungmin Kang
Alperen Ozis
Hayrettin Eren Yildiz
Mitash Shah

Citation

If you use TruthTorchLM in your research, please cite:

@inproceedings{yaldiz-etal-2025-truthtorchlm,
    title = "{T}ruth{T}orch{LM}: A Comprehensive Library for Predicting Truthfulness in {LLM} Outputs",
    author = {Yaldiz, Duygu Nur  and
      Bakman, Yavuz Faruk  and
      Kang, Sungmin  and
      {\"O}zi{\c{s}}, Alperen  and
      Yildiz, Hayrettin Eren  and
      Shah, Mitash Ashish  and
      Huang, Zhiqi  and
      Kumar, Anoop  and
      Samuel, Alfy  and
      Liu, Daben  and
      Karimireddy, Sai Praneeth  and
      Avestimehr, Salman},
    editor = {Habernal, Ivan  and
      Schulam, Peter  and
      Tiedemann, J{\"o}rg},
    booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.emnlp-demos.54/",
    pages = "717--728",
    ISBN = "979-8-89176-334-0",
}

License

TruthTorchLM is released under the MIT License.

For inquiries or support, feel free to contact the maintainers.
