INDEX
Explanations
The neuron selectively activates on occurrences of the word “older.”
Negative Logits
861        -0.07
journey    -0.06
FUNCTION   -0.06
Comb       -0.06
(fm        -0.06
_cum       -0.06
153        -0.06
_param     -0.06
Given      -0.06
tunnels    -0.06
Positive Logits
older      0.13
newer      0.09
newest     0.08
Older      0.08
ốc         0.08
پست        0.07
över       0.07
punk       0.07
.over      0.07
iger       0.07
Activations Density 0.008%
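
For a sense of scale, the density figure can be read as the share of tokens on which the neuron fires at all. The sketch below rests on that assumption: the page does not say how density is computed, and the function name, the zero threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of tokens with a nonzero
    activation. The dashboard itself does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the neuron fires on 8 tokens out of 100,000.
sample = [0.0] * 99_992 + [1.5] * 8
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.008% corresponds to roughly 8 active tokens per 100,000, which fits a neuron that responds to a single word.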