Explanations
This neuron activates on the word “able” in constructions expressing that something is capable of, or able to do, something.
Negative Logits
editions       -0.06
прос           -0.06
bgColor        -0.06
_red           -0.06
dor            -0.06
hips           -0.06
ávky           -0.06
ylim           -0.06
communities    -0.06
issue          -0.05

Positive Logits
tabela         0.08
První          0.07
είται          0.07
enact          0.07
>↵             0.07
_piece         0.07
sera           0.07
guit           0.07
:↵↵            0.07
>',↵           0.06
Activations Density 0.030%
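
The density figure can be read as the share of tokens on which the neuron fires. The sketch below assumes that reading (an activation counts as "active" when it is strictly above zero); the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (above-threshold) activation. The dashboard may define it differently.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the neuron fires on 3 tokens out of 10,000.
acts = [0.0] * 9997 + [0.4, 1.2, 0.7]
density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.030%
```

Under this reading, a density of 0.030% corresponds to roughly 3 active tokens per 10,000, which fits a neuron tied to a single word.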