Gemma Scope
A set of interpretability tools built to help researchers understand the inner workings of Gemma 2
Examine the behavior of individual model layers while the model processes requests, to help address critical concerns including hallucinations, biases, and manipulation.
Overview
Gemma Scope provides researchers with a suite of sparse autoencoders. Think of these as microscopes that let you zoom in on dense, compressed activations, and expand them to larger, sparser, more interpretable forms.
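To make the microscope analogy concrete, here is a minimal sketch of a sparse autoencoder's forward pass. It is an illustration under stated assumptions, not Gemma Scope's implementation: the weights are random and untrained, the sizes are tiny, the thresholded activation is chosen for simplicity, and the names `make_sae`, `encode`, and `decode` are hypothetical. A real sparse autoencoder is trained so that the reconstruction closely matches the input; this one is not.

```python
import random


def make_sae(d_model, d_sae, seed=0):
    """Build random (untrained) weights for a toy sparse autoencoder.

    d_model: width of the dense activation being inspected.
    d_sae:   width of the expanded feature space (larger than d_model).
    """
    rng = random.Random(seed)
    scale = 1.0 / d_model ** 0.5
    return {
        "W_enc": [[rng.gauss(0.0, scale) for _ in range(d_model)]
                  for _ in range(d_sae)],
        "b_enc": [0.0] * d_sae,
        "W_dec": [[rng.gauss(0.0, scale) for _ in range(d_sae)]
                  for _ in range(d_model)],
        "b_dec": [0.0] * d_model,
        # Assumed value: pre-activations at or below this are zeroed out.
        "threshold": 0.5,
    }


def encode(sae, x):
    """Expand a dense activation into a wider feature vector with many zeros."""
    features = []
    for row, bias in zip(sae["W_enc"], sae["b_enc"]):
        pre = sum(w * xi for w, xi in zip(row, x)) + bias
        features.append(pre if pre > sae["threshold"] else 0.0)
    return features


def decode(sae, features):
    """Map the feature vector back to the original activation space."""
    return [sum(w * f for w, f in zip(row, features)) + bias
            for row, bias in zip(sae["W_dec"], sae["b_dec"])]


sae = make_sae(d_model=4, d_sae=16)
activation = [0.8, -0.3, 0.1, 0.5]  # stand-in for one layer's activation
features = encode(sae, activation)
reconstruction = decode(sae, features)
active = [i for i, f in enumerate(features) if f > 0.0]
print(f"{len(active)} of {len(features)} features active: {active}")
```

The indices in `active` are what a researcher would inspect: each nonzero entry is one candidate feature that fired for this input, which is easier to reason about than the dense vector it came from.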
Capabilities
- Perform mechanistic interpretability research
  Evaluate the precise behavior of Gemma 2 models with layer-level analysis.
- Debug model behavior
  Pinpoint the source of specific model issues (such as biases and hallucinations) by examining layer-specific representations.