Explanations
measurements/comparisons
This neuron detects comparative adjectives or adverbs that express a relative “more/less” quality (e.g., German “leichter” (“lighter,” “easier”) and “weicher” (“softer”)).
Negative Logits
happier    -0.08
lower      -0.08
larger     -0.07
better     -0.07
more       -0.07
less       -0.07
24         -0.07
%)↵↵       -0.07
nicer      -0.07
easier     -0.07
Positive Logits
"><?       0.06
getObject  0.06
Pg         0.06
прек       0.06
disposit   0.06
(key       0.06
hard       0.06
گذ         0.06
potřeb     0.06
belir      0.06
Activations Density 0.152%