Explanations
This neuron responds to words signaling improved performance or increased levels, especially comparative and superlative terms (e.g. faster, increased, improved).
Negative Logits
aira      -0.07
placing   -0.06
closely   -0.06
melan     -0.06
Ã         -0.06
nist      -0.06
důsled    -0.06
kin       -0.06
UP        -0.06
tennis    -0.06
Positive Logits
-days     0.07
ルフ      0.07
CAT       0.06
ώς        0.06
Greens    0.06
fName     0.06
오        0.06
ủi        0.06
*)        0.06
$msg      0.06
Activations Density: 0.027%