Explanations
The neuron selectively fires on occurrences of the word “center” (especially in forms like “centered”).
Negative Logits
Token        Logit
Ansi         -0.07
Архів        -0.06
acking       -0.06
acı          -0.06
discussed    -0.06
INDEX        -0.06
ax           -0.06
'_           -0.06
still        -0.06
ुमत          -0.06
Positive Logits
Token        Logit
#            0.06
¯¯           0.06
'][$         0.06
explodes     0.06
trough       0.06
OfDay        0.06
dva          0.06
).(          0.06
dilig        0.06
Tv           0.06
Activations Density 0.192%