Explanations
This neuron responds to verbs describing game actions or mechanics (e.g., upgrading, invading, pillaging).
Negative Logits
forg           -0.06
роботу         -0.06
affinity       -0.06
Bilg           -0.06
regenerated    -0.06
noct           -0.06
MULTI          -0.06
Gram           -0.06
unequal        -0.06
ב              -0.06
Positive Logits
톡             0.06
wore           0.06
arrangements   0.06
Newtown        0.06
ราย            0.06
ppelin         0.06
аза            0.06
Singapore      0.06
rock           0.06
тоже           0.06
Activations Density: 0.323%