INDEX
Explanations
affirmative responses or confirmations
Negative Logits
_LSB       -0.15
apon       -0.15
aldi       -0.15
aux        -0.14
ubo        -0.14
_native    -0.13
cosa       -0.13
ita        -0.13
éli        -0.13
nisi       -0.13
Positive Logits
plural      0.26
indeed      0.25
plural      0.21
yes         0.20
seriously   0.19
mesmo       0.18
despite     0.18
heard       0.18
Indeed      0.17
yes         0.17
Activations Density: 0.057%