Explanations
This neuron activates on numeric values (especially decimal or floating‐point numbers) in the text.
Negative Logits
phrases   -0.06
roofs     -0.06
สาม       -0.06
houses    -0.06
(("       -0.06
ambre     -0.06
/style    -0.06
Ur        -0.06
AR        -0.06
Heart     -0.06
Positive Logits
_nums       0.08
:variables  0.07
Univers     0.07
езульт      0.07
ją          0.07
kaz         0.07
Ortiz       0.07
unfairly    0.07
simply      0.07
leggings    0.07
Activations Density 0.025%