Explanations
negations and expressions of disapproval or reluctance
Negative Logits
ricks          -0.17
avou           -0.16
proh           -0.15
quette         -0.15
IELDS          -0.15
-coordinate    -0.14
ÑĩаÑģ           -0.14
prit           -0.14
911            -0.13
ignum          -0.13
Positive Logits
aign           0.14
accepting      0.14
_EXT           0.14
ازÙĩ           0.14
á»ķ            0.14
Grimm          0.14
icker          0.13
adan           0.13
reliance       0.13
Sharp          0.13
Activations Density 0.123%