Explanations
The neuron responds to numeric tokens that look like floating-point confidence scores.
Negative Logits
Token        Logit
GUIStyle     -0.07
Hag          -0.06
їм           -0.06
.cpu         -0.06
ワ           -0.06
ระบบ         -0.06
(tp          -0.06
generator    -0.06
Napoleon     -0.06
Choosing     -0.06
Positive Logits
Token          Logit
!↵↵↵           0.07
واج            0.06
ある           0.06
胜             0.06
_conversion    0.06
ematic         0.06
balanced       0.06
dịch           0.06
.netbeans      0.06
;',            0.06
Activations Density: 0.014%