Inspiration

Google's recent release of sparse autoencoders for their Gemma series of large language models inspired us to dig in and understand what they were. As a developer, imagine making your LLM happier, or better at storytelling, just by saying so.

What it does

Our project takes Google's newly released sparse autoencoders and makes them operational. We developed steering code that gives effective control over these autoencoders, enabling text transformations like generating rhymes, altering emotional tone, or producing detailed image prompts.
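At its core, this kind of steering amounts to adding a scaled SAE decoder direction to the model's residual-stream activations. Here is a minimal NumPy sketch of that one step; the vectors, the 2304 dimension, and the coefficient are all illustrative stand-ins (in the real pipeline the activation would come from a Gemma forward pass and the direction from a GemmaScope decoder row):

```python
import numpy as np

def steer_activation(activation, feature_direction, coefficient):
    """Add a scaled, normalized SAE decoder direction to an activation."""
    unit = feature_direction / np.linalg.norm(feature_direction)
    return activation + coefficient * unit

# Toy stand-ins for real tensors (2304 is used here as an example width).
rng = np.random.default_rng(0)
activation = rng.normal(size=2304)         # one residual-stream vector
feature_direction = rng.normal(size=2304)  # one SAE decoder row
steered = steer_activation(activation, feature_direction, coefficient=8.0)
```

Turning the coefficient up amplifies the concept that feature represents; turning it negative suppresses it.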

How we built it

  1. We obtained the sparse autoencoders (GemmaScope) recently released by Google.
  2. Upon discovering that the existing code didn't work out of the box, we developed our own steering code.
  3. We created a user-friendly command-line interface to demonstrate the capabilities of these autoencoders. All you need to do is specify the dimension you want to search for (e.g. storytelling, politeness, inspirational).
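The concept lookup in step 3 can be as simple as matching the user's keyword against feature explanations. A toy sketch of that idea, with invented indices and explanation texts (in practice these could come from a source such as Neuronpedia's exported explanations):

```python
# Hypothetical concept-to-feature lookup; names and data are illustrative.
def find_features(concept, explanations):
    """Return (feature_index, explanation) pairs mentioning the concept."""
    needle = concept.lower()
    return [(idx, text) for idx, text in sorted(explanations.items())
            if needle in text.lower()]

explanations = {
    1024: "phrases related to storytelling and narrative structure",
    2048: "polite requests and formal greetings",
    4096: "references to colors in image descriptions",
}
hits = find_features("storytelling", explanations)
```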

Challenges we ran into

  1. Developing steering code from scratch when the provided code didn't function as expected
  2. Understanding and effectively manipulating sparse autoencoders
  3. Defining the evolutionary search space and the evolutionary search itself. Special thanks to Claude for this one.
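The evolutionary search over steering coefficients can be sketched as a simple (1+λ) hill climber: mutate the coefficients, keep the child if it scores higher. Everything below is illustrative; in the real project the fitness function would generate text with the steered model and grade it, whereas here a toy quadratic stands in:

```python
import random

def evolve(fitness, init, generations=50, children=8, sigma=1.0, seed=0):
    """(1+lambda) evolutionary search over {feature_index: coefficient}."""
    rng = random.Random(seed)
    best, best_score = dict(init), fitness(init)
    for _ in range(generations):
        for _ in range(children):
            # Mutate every coefficient with Gaussian noise.
            child = {f: c + rng.gauss(0.0, sigma) for f, c in best.items()}
            score = fitness(child)
            if score > best_score:  # keep only improvements
                best, best_score = child, score
    return best, best_score

# Toy fitness: prefers coefficients near 8.0 (stands in for judging
# steered model output with an LLM grader).
def toy_fitness(candidate):
    return -sum((c - 8.0) ** 2 for c in candidate.values())

best, best_score = evolve(toy_fitness, {1024: 0.0, 2048: 0.0})
```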

Accomplishments that we're proud of

  1. Successfully implementing functional steering code for Google's sparse autoencoders
  2. Rapid development and problem-solving in a cutting-edge AI domain
  3. Combining evolutionary search with pretrained LLM capabilities

What we learned

How sparse autoencoders are trained, and how they can be used to steer model behavior.
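For reference, the classic SAE training objective is reconstruction error plus a sparsity penalty on the feature activations. GemmaScope's SAEs actually use a JumpReLU activation and a different sparsity term; the sketch below uses plain ReLU with L1, with toy dimensions, just to show the core idea:

```python
import numpy as np

def sae_loss(x, W_enc, b_enc, W_dec, b_dec, l1_coeff=1e-3):
    """Reconstruction + L1 sparsity loss for a vanilla sparse autoencoder."""
    f = np.maximum(x @ W_enc + b_enc, 0.0)   # sparse feature activations
    x_hat = f @ W_dec + b_dec                # reconstruction of x
    recon = float(np.mean((x - x_hat) ** 2))
    sparsity = l1_coeff * float(np.abs(f).sum())
    return recon + sparsity, f

# Toy sizes: 16-d activations, 64 learned features (illustrative only).
rng = np.random.default_rng(1)
d_model, n_features = 16, 64
x = rng.normal(size=d_model)
W_enc = 0.1 * rng.normal(size=(d_model, n_features))
b_enc = np.zeros(n_features)
W_dec = 0.1 * rng.normal(size=(n_features, d_model))
b_dec = np.zeros(d_model)
loss, features = sae_loss(x, W_enc, b_enc, W_dec, b_dec)
```

The rows of `W_dec` are the decoder directions that steering code adds back into the residual stream.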

What's next for sae-evolver

  1. Bigger search on bigger models
  2. Interface
  3. Extract the weight changes into a LoRA for easy use of useful SAEs
