Explanations
The neuron activates on almost no tokens (activation density 0.002%), so no consistent target pattern can be identified.
Negative Logits
Token                  Logit
�                      -0.07
drones                 -0.07
plaint                 -0.07
tart                   -0.07
.lineWidth             -0.07
Attacks                -0.07
.RegularExpressions    -0.06
题                     -0.06
triangles              -0.06
msgstr                 -0.06
Positive Logits
Token                  Logit
,class                 0.06
Dell                   0.06
(↵                     0.06
bouts                  0.06
meydana                0.06
'}↵                    0.06
äll                    0.06
Совет                  0.06
Farage                 0.06
(Call                  0.06
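For readers unfamiliar with logit lists like the ones above, here is a minimal sketch of one common way such rankings are produced: projecting a neuron's output direction onto each token's unembedding vector and sorting the results. This is an assumption about the method, not something the page states. The names `logit_effects`, `top_logits`, `w_out`, and `unembed` are hypothetical, and the vectors are toy values unrelated to the figures shown.

```python
def logit_effects(w_out, unembed):
    """Dot the neuron's output direction with each token's unembedding vector."""
    return {
        tok: sum(w * u for w, u in zip(w_out, vec))
        for tok, vec in unembed.items()
    }


def top_logits(effects, k):
    """Return the k most negative and k most positive (token, value) pairs."""
    ranked = sorted(effects.items(), key=lambda kv: kv[1])
    negative = ranked[:k]          # most negative first
    positive = ranked[::-1][:k]    # most positive first
    return negative, positive


# Toy, hypothetical data: a 2-dimensional output direction and 4 tokens.
w_out = [1.0, -2.0]
unembed = {
    "a": [0.5, 0.0],
    "b": [0.0, 0.5],
    "c": [-2.0, 0.0],
    "d": [0.0, -1.0],
}

effects = logit_effects(w_out, unembed)
negative, positive = top_logits(effects, k=2)
```

Values as small and uniform as the ones listed here (all near ±0.06) would suggest, under this reading, that the neuron has no strong effect on any particular output token.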
Activations Density 0.002%
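As a rough illustration of what a density figure like this measures, the sketch below computes the percentage of tokens on which a neuron's activation exceeds a threshold. The function name `activation_density`, the zero threshold, and the sample data are assumptions for illustration. The page does not state how its figure was computed.

```python
def activation_density(activations, threshold=0.0):
    """Percentage of tokens whose activation is strictly above `threshold`."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 2 active tokens out of 100,000.
sample = [0.0] * 99_998 + [0.3, 1.2]
density = activation_density(sample)
print(f"Activations Density {density:.3f}%")
```

With this toy sample, two active tokens in 100,000 give a density of 0.002%, which shows how sparse a neuron at that level is.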