Explanations
The neuron activates strongly on the auxiliary “will,” marking future-tense or modal statements.
Negative Logits
Pett           -0.07
GameManager    -0.07
Clear          -0.07
Bent           -0.07
Planner        -0.07
Terminator     -0.06
ButtonText     -0.06
_shot          -0.06
P              -0.06
[blank]        -0.06
Positive Logits
writers        0.06
writer         0.06
rog            0.06
thù            0.06
hel            0.06
πριν           0.06
eigentlich     0.06
{-#            0.06
Mouth          0.06
rophe          0.06
Activations Density 0.022%