INDEX
Explanations
fiction excerpts
This neuron detects special “speaker” or control tokens that mark system/user/assistant headers and in-role instructions.
Negative Logits
узы           -0.07
%m            -0.07
ESSAGES       -0.06
assertions    -0.06
.getcwd       -0.06
InputStream   -0.06
subjected     -0.06
chain         -0.06
one           -0.06
GORITH        -0.06
Positive Logits
grind         0.08
tahun         0.07
govern        0.07
�             0.07
_display      0.07
exhibiting    0.07
ผม            0.06
ص             0.06
τικής         0.06
bát           0.06
Activations Density 0.032%
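
As a rough illustration of how a density figure like the one above is usually read, the sketch below assumes that "activation density" means the fraction of sampled tokens on which the neuron's activation exceeds a threshold. That definition, the function name `activation_density`, and the `threshold` parameter are assumptions made for illustration; the page does not state how its number was computed.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation is above `threshold`.

    Assumption: density = (number of active tokens) / (number of sampled tokens).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, 32 of which activate the neuron.
sample = [0.0] * 99_968 + [1.0] * 32
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.032% would mean the neuron fires on roughly 32 of every 100,000 sampled tokens.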