Negative Logits
interchangeable  -0.08
atawa            -0.08
utawa            -0.08
الخار            -0.08
atanapi          -0.07
некалькі         -0.07
birkaç           -0.07
hesitate         -0.07
Frequently       -0.07
pard             -0.07

Positive Logits
符合               0.10
fulfilled        0.08
mechanism        0.08
demonstrates     0.08
验证               0.08
satisfaction     0.08
satisfied        0.08
满足               0.08
agreeable        0.08
验                0.08

Activations Density 0.114%