Explanations
comparative phrases related to capabilities and expectations
Negative Logits
rupa        -0.15
cakes       -0.14
only        -0.14
Lah         -0.14
Boeh        -0.14
place       -0.14
higher      -0.14
dep         -0.14
cake        -0.13
extremism   -0.13
Positive Logits
usual       0.17
strict      0.15
antro       0.15
ervo        0.15
any         0.15
اقع         0.14
Strict      0.14
ouri        0.14
cope        0.14
COPE        0.14
Activations Density 0.091%