AI Evals

Setting up evals

Create a new virtual environment:

python -m venv venv

Activate the virtual environment:

source venv/bin/activate

Install the dependencies:

pip install -r requirements.txt

If you want to run the sql tests, create the .env file by copying the sample file:

cp .env.sample .env

Then update the new .env file with your Snowflake credentials.
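Before running the sql tests, it can help to confirm that every variable listed in .env.sample has a value in .env. The sketch below is one way to do that, not part of the evals package: the helper names read_env and unset_keys are hypothetical, and it assumes both files use plain KEY=VALUE lines.

```python
from pathlib import Path


def read_env(path):
    """Parse a KEY=VALUE style env file into a dict (assumed file format)."""
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip()
    return values


def unset_keys(sample_path=".env.sample", env_path=".env"):
    """Return the sample's variable names that are missing or empty in the env file."""
    sample = read_env(sample_path)
    env = read_env(env_path)
    return sorted(key for key in sample if not env.get(key))
```

Calling unset_keys() from the folder that holds both files returns an empty list once every credential has been filled in.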

Running the tests

Navigate to the mito folder; every command below is run from there.
To run the chat tests, run the command:

python -m evals.main --test_type=chat

To run the inline_code_completion tests, run the command:

python -m evals.main --test_type=inline_code_completion

To run the smart-debugger tests, run the command:

python -m evals.main --test_type=smart_debug

To run the sql tests, run the command:

python -m evals.main --test_type=sql
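To run all four test types in one go, the commands above can be wrapped in a short script. This is a minimal sketch rather than something the repository provides: command_for and run_all are hypothetical names, and it assumes the script is started from the mito folder and that a non-zero exit code signals a failed run.

```python
import subprocess
import sys

# The four test types named in this README.
TEST_TYPES = ["chat", "inline_code_completion", "smart_debug", "sql"]


def command_for(test_type):
    """Build the command line for one test type, using the current interpreter."""
    return [sys.executable, "-m", "evals.main", f"--test_type={test_type}"]


def run_all(test_types=TEST_TYPES, runner=subprocess.run):
    """Run each test type in turn and return a {test_type: exit code} mapping."""
    results = {}
    for test_type in test_types:
        completed = runner(command_for(test_type))
        results[test_type] = completed.returncode
    return results
```

Calling run_all() executes the suites sequentially. The sql entry still needs the .env file described in the setup section, so it can be left out by passing a shorter list as test_types.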

Running specific tests

To specify which tests to run, set one or more of the following flags:

--test_type
--test
--prompt
--tags
--model

For example, to run all tests for the single_shot_prompt prompt, run:

python -m evals.main --test_type=chat --prompt=single_shot_prompt
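When the same filters are reused often, the command can be assembled programmatically. The helper below is a hypothetical sketch, not part of evals.main: it only joins the flags listed above in the --flag=value form used in this README, treats --test_type as always present (an assumption), and passes each value through unchanged because the accepted formats for --test, --tags, and --model are not documented here.

```python
import sys


def build_eval_command(test_type, test=None, prompt=None, tags=None, model=None):
    """Assemble a `python -m evals.main` command from the filters that are set."""
    command = [sys.executable, "-m", "evals.main", f"--test_type={test_type}"]
    optional = {"test": test, "prompt": prompt, "tags": tags, "model": model}
    for flag, value in optional.items():
        # Only emit flags the caller actually supplied.
        if value is not None:
            command.append(f"--{flag}={value}")
    return command
```

For instance, build_eval_command("chat", prompt="single_shot_prompt") yields the same arguments as the command above, and the resulting list can be handed to subprocess.run from the mito folder.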