INDEX
Explanations
phrases that express excessiveness or negativity in a situation
Negative Logits
abus       -0.14
rieb       -0.14
anness     -0.14
lier       -0.14
ávÄĽ       -0.14
à¸ļรร      -0.13
odus       -0.13
olab       -0.13
æĽ´        -0.13
orthand    -0.13
Positive Logits
much       0.33
Much       0.29
much       0.28
Much       0.26
MUCH       0.24
many       0.20
far        0.19
_MANY      0.19
soon       0.18
many       0.18
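
Lists like the two above are commonly obtained by projecting a feature's decoder direction through the model's unembedding matrix and sorting the resulting per-token scores. The page does not state how its values were computed, so the following is only a minimal sketch under that assumption. The function name `top_logit_tokens` and all numbers are hypothetical toy values, not real model weights.

```python
def top_logit_tokens(decoder_vec, unembed, vocab, k=2):
    """Return the k most promoted and k most suppressed tokens.

    decoder_vec: the feature's direction in the residual stream (assumed).
    unembed: one row of weights per vocabulary token (toy values here).
    """
    # Score each token by the dot product of its unembedding row
    # with the feature direction.
    scores = [sum(d * w for d, w in zip(decoder_vec, row)) for row in unembed]
    ranked = sorted(zip(vocab, scores), key=lambda pair: pair[1], reverse=True)
    positive = ranked[:k]          # highest scores first
    negative = ranked[-k:][::-1]   # lowest scores first
    return positive, negative


# Toy example: a 2-dimensional "model" with a 4-token vocabulary.
vocab = ["much", "many", "abus", "odus"]
decoder_vec = [1.0, 0.0]
unembed = [[0.33, 0.1], [0.20, -0.2], [-0.14, 0.3], [-0.13, 0.0]]

positive, negative = top_logit_tokens(decoder_vec, unembed, vocab)
print([tok for tok, _ in positive])
print([tok for tok, _ in negative])
```

Under this reading, a positive value means the feature pushes the model toward predicting that token, and a negative value means it pushes away from it.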
Activations Density 0.031%
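
An activation density figure like this is usually read as the fraction of sampled tokens on which the feature fires at all. The sketch below assumes that definition. The function name `activation_density`, the zero threshold, and the sample data are illustrative assumptions, not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds the threshold.

    Assumes density means "share of tokens with a nonzero activation";
    the threshold of 0.0 is a hypothetical default.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Toy sample: the feature fires on 3 of 10,000 tokens.
sample = [0.0] * 9997 + [1.2, 0.4, 2.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low indicates a sparse feature: it is silent on nearly all tokens and fires only in a narrow set of contexts.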