INDEX
Explanations
Gemma team at Google DeepMind
mentions of the model name "Gemma" or tokens referring to the assistant's identity.
Negative Logits
p    2.58
a    2.31
of    2.02
de    1.77
b    1.73
an    1.70
c    1.68
g    1.60
y    1.41
sp    1.40
Positive Logits
م    1.19
ер    1.08
м    1.06
fertil    1.05
großen    1.04
নি    1.03
(),    1.02
různých    1.02
sơn    1.01
ુર    0.98
Activations Density 0.620%