Explanations
This neuron does not detect or respond to any specific text pattern; it remains inactive.
Negative Logits
France      -0.07
칠          -0.07
aso         -0.07
leaking     -0.07
_Att        -0.07
Devil       -0.06
Рус         -0.06
talking     -0.06
'Brien      -0.06
Brown       -0.06
Positive Logits
very        0.09
Very        0.08
’я          0.08
downright   0.08
(move       0.07
самых       0.07
(varargin   0.07
bed         0.07
červ        0.06
ARY         0.06
Activations Density 0.007%