Explanations
The neuron fires on numeric tokens (especially decimals, version numbers, timestamps).
Negative Logits
Token          Logit
uncomfortable  -0.07
한번           -0.07
forder         -0.07
bonus          -0.06
autorelease    -0.06
_first         -0.06
ILD            -0.06
birinci        -0.06
stomach        -0.06
AD             -0.06
Positive Logits
Token          Logit
UserService    0.07
='".           0.06
sanitary       0.06
hij            0.06
semiclass      0.06
месяца         0.06
conformity     0.06
=sys           0.06
-org           0.06
tsx            0.06
Activations Density: 0.002%
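
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled tokens on which the neuron's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and used only for illustration:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    neuron fires at all. The source page does not confirm this definition.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the neuron fires on 2 of 100,000 tokens.
sample = [0.0] * 99_998 + [1.3, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.002%
```

Under this reading, a density of 0.002% corresponds to roughly 2 firing tokens per 100,000 sampled, which would make this a very sparse neuron.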