Explanations
Helping verbs
The neuron activates on hedging or speculative language, especially modal verbs such as “may” or “might.”
Negative Logits
Dude      -0.06
Jim       -0.06
Simon     -0.06
画        -0.06
ука       -0.06
458       -0.06
林        -0.06
翰        -0.06
Brewer    -0.06
tote      -0.06
Positive Logits
ルの            0.07
บาล            0.07
OCI            0.06
pošk           0.06
बदल            0.06
наблю          0.06
verbose        0.06
-sum           0.06
(Application   0.06
ा,             0.06
Activations Density 0.040%
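
The page does not define the density figure. One plausible reading is that it is the share of sampled token positions on which the neuron fires at all. The sketch below illustrates that reading only; the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions with a nonzero
    (above-threshold) activation. The source page does not confirm this.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 4 of which fire.
sample = [0.0] * 9996 + [0.8, 1.2, 0.3, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.040%
```

Under this reading, a density of 0.040% would mean the neuron is active on roughly 4 of every 10,000 token positions, which fits a feature that responds to a narrow class of words.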