INDEX
Explanations
This neuron selectively activates on numeric tokens and quantified expressions (e.g., numbers, decimals, measurements, and reference indices).
Negative Logits
mismo	-0.07
無	-0.07
Vertical	-0.06
住	-0.06
kid	-0.06
학생	-0.06
约	-0.06
�	-0.06
Smile	-0.06
_strength	-0.06
Positive Logits
-Assad	0.06
стор	0.06
!!.	0.06
欧	0.06
crowded	0.06
-oper	0.06
UIGraphics	0.06
了解	0.06
โลก	0.06
-ext	0.06
Activations Density 0.223%
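
For readers unfamiliar with the density figure, the sketch below shows one way such a number could be computed. It assumes that density means the fraction of sampled tokens on which the neuron's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical and are not taken from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" counts a token as active when its activation
    is strictly greater than the threshold (zero by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for 1,000 sampled tokens: two fire, the rest do not.
sample = [0.0] * 998 + [1.7, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.223% would mean the neuron fires on roughly 2 of every 1,000 sampled tokens.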