Explanations
The neuron activates on occurrences of the word “numbers,” especially in code contexts (e.g., variable or function names referencing “numbers”).
Negative Logits
jade          -0.06
.One          -0.06
Magic         -0.06
.Black        -0.06
全            -0.06
↵             -0.06
policy        -0.06
////          -0.06
prefect       -0.06
/n            -0.06
Positive Logits
εισ           0.07
igi           0.07
Feb           0.06
hdc           0.06
vaping        0.06
тр            0.06
predecessor   0.06
Fraction      0.06
>'.           0.06
Pri           0.06
Activations Density: 0.016%
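As a rough reading of the density figure, assuming it is the percentage of sampled tokens on which the neuron has a nonzero activation (the page itself does not define it), the sketch below converts the percentage into an expected count of activating tokens. The function name `expected_active_tokens` and the 100,000-token sample size are hypothetical, chosen only for illustration.

```python
def expected_active_tokens(density_percent: float, n_tokens: int) -> float:
    """Expected number of tokens with nonzero activation.

    Assumption: density_percent is the percentage of tokens on which
    the neuron fires at all (not an average activation strength).
    """
    if density_percent < 0 or n_tokens < 0:
        raise ValueError("density_percent and n_tokens must be non-negative")
    # Convert the percentage to a fraction, then scale by the sample size.
    return density_percent / 100.0 * n_tokens


if __name__ == "__main__":
    # Hypothetical sample of 100,000 tokens at the reported 0.016% density.
    print(expected_active_tokens(0.016, 100_000))
```

Under that assumption, a density of 0.016% corresponds to roughly 16 activating tokens per 100,000.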