Explanations
The neuron activates primarily on the word “Monster,” especially when used as a headline or title token.
Negative Logits
Token            Logit
Painting         -0.08
�                -0.07
Faculty          -0.07
circum           -0.07
Catherine        -0.07
Geb              -0.07
Communication    -0.07
Dou              -0.07
अव               -0.06
Cou              -0.06
Positive Logits
Token            Logit
monster          0.13
monsters         0.09
Monsters         0.09
disasters        0.09
Monster          0.09
disaster         0.08
Beast            0.08
MAN              0.08
beast            0.08
Monster          0.08
Activations Density 0.006%