Popular repositories
- Snorkel-Wordle-Benchmark Public
Benchmarking LLM performance in playing the game of Wordle.
- pydantic Public Forked from pydantic/pydantic
Data validation using Python type hints
Python 3
- terminal-bench Public Forked from harbor-framework/terminal-bench
A benchmark for LLMs on complicated tasks in the terminal
Python 3
Repositories
- Toolathlon Public Forked from hkust-nlp/Toolathlon
[ICLR 2026] The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution
- aws-assume-role-with-web-identity-buildkite-plugin Public Forked from buildkite-plugins/aws-assume-role-with-web-identity-buildkite-plugin
A Buildkite plugin to assume-role-with-web-identity using a Buildkite OIDC token before running the build command
- OpenEnv Public Forked from meta-pytorch/OpenEnv
An interface library for RL post training with environments.
- UI-Elements-Visualizer Public
- mai-banking-hr-data-viewer Public
- UI-Elements-Training Public
- telegraf Public Forked from influxdata/telegraf
The plugin-driven server agent for collecting & reporting metrics.
- influxdb Public Forked from influxdata/influxdb
Scalable datastore for metrics, events, and real-time analytics
- terminal-bench Public Forked from harbor-framework/terminal-bench
A benchmark for LLMs on complicated tasks in the terminal
People
This organization has no public members. You must be a member to see who’s a part of this organization.