@AlignmentResearch

FAR.AI

Frontier alignment research to ensure the safe development and deployment of advanced AI systems.

Popular repositories

  1. tuned-lens Public

    Tools for understanding how transformer predictions are built layer-by-layer

    Python 574 63

  2. go_attack Public

    Python 91 9

  3. vlmrm Public

    Python 70 15

  4. gpt-4-novel-apis-attacks Public

    23 1

  5. learned-planner Public

    Interpreting Learned Search and Planning: Reverse-engineering recurrent convolutional networks (DRC) that play Sokoban

    Python 17 5

  6. scaling-poisoning Public

    Python 16 3

Repositories

Showing 10 of 59 repositories
  • impossiblebench Public Forked from safety-research/impossiblebench

    Official Inspect Implementation for "ImpossibleBench: Measuring LLMs' Propensity of Exploiting Test Cases"

    Python 0 MIT 6 0 0 Updated Mar 15, 2026
  • persona_vectors Public Forked from safety-research/persona_vectors

    Persona Vectors: Monitoring and Controlling Character Traits in Language Models

    Python 3 91 0 4 Updated Feb 20, 2026
  • obfuscation-atlas Public

    The Obfuscation Atlas: Mapping Where Honesty Emerges in RLVR with Deception Probes

    Python 5 1 0 0 Updated Feb 19, 2026
  • AttemptPersuadeEval
    Python 12 Apache-2.0 2 2 1 Updated Jan 30, 2026
  • far-project-template Public

    Template for new experiment repositories at FAR

    Python 0 0 0 4 Updated Jan 9, 2026
  • entangled.py Public Forked from rhaps0dy/entangled.py

    Python port of Entangled

    Python 0 Apache-2.0 12 0 0 Updated Dec 19, 2025
  • kubespray Public Forked from kubernetes-sigs/kubespray

    Deploy a Production Ready Kubernetes Cluster

    Jinja 0 Apache-2.0 7,022 0 0 Updated Dec 12, 2025
  • pytorch Public Forked from pytorch/pytorch

    Tensors and Dynamic neural networks in Python with strong GPU acceleration

    Python 0 27,930 0 0 Updated Dec 9, 2025
  • accelerate Public Forked from huggingface/accelerate

    🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

    Python 1 Apache-2.0 1,334 0 0 Updated Nov 25, 2025
  • voltage-park-sdk
    Python 0 BSD-3-Clause 0 0 0 Updated Nov 20, 2025

People

This organization has no public members. You must be a member to see who’s a part of this organization.
