INDEX
Explanations
affirmative statements and expressions of agreement
Negative Logits
Viet
-0.17
cth
-0.16
dej
-0.15
vit
-0.15
rippling
-0.14
ohen
-0.14
jem
-0.14
رض
-0.14
.AddItem
-0.14
_ALLOW
-0.14
Positive Logits
ạp
0.16
204
0.15
038
0.14
ايل
0.14
ignum
0.14
Sense
0.14
olem
0.14
aman
0.14
ailer
0.13
{{--<
0.13
Activations Density 0.158%