Explanations
Formal language
The neuron selectively responds to short grammatical connector words (function words) in running text.
Negative Logits
unpl            -0.07
militant        -0.07
Fuk             -0.06
ánchez          -0.06
laboratories    -0.06
iz              -0.06
que             -0.06
Ibn             -0.06
viz             -0.06
instruction     -0.06
Positive Logits
ються           0.08
této            0.07
Dare            0.06
موبایل          0.06
acciones        0.06
')}>↵           0.06
ONLY            0.06
。他             0.06
↵               0.06
Trash           0.06
Activations Density 0.136%