Explanations
This neuron flags occurrences of the word “digit.”
Negative Logits
march       -0.07
Awareness   -0.07
Wall        -0.07
نمی         -0.07
moms        -0.07
awareness   -0.06
18          -0.06
Accum       -0.06
Patio       -0.06
❤           -0.06
Positive Logits
digits      0.10
_digits     0.08
dign        0.07
digging     0.07
digit       0.07
چی          0.07
SCRIPT      0.07
digit       0.07
.Down       0.07
divert      0.07
Activations Density 0.006%