Explanations
The neuron fires on occurrences of “want” (particularly in “want to”), i.e. expressions of desire or intended action.
Negative Logits
봉            -0.07
Buzz          -0.07
mic           -0.06
benz          -0.06
fullscreen    -0.06
ml            -0.06
.Movie        -0.06
ُل            -0.06
佐            -0.06
/themes       -0.06
Positive Logits
\C            0.07
'ils          0.07
intimidate    0.06
ENCIL         0.06
>",↵          0.06
reducers      0.06
阪            0.06
ìm            0.06
clashed       0.06
lique         0.06
Activations Density 0.036%