INDEX
Explanations
emphatic statements or exclamations
NEGATIVE LOGITS
uto         -0.14
verty       -0.14
jej         -0.14
yte         -0.14
formance    -0.14
enberg      -0.13
APE         -0.13
ushman      -0.13
.Hand       -0.13
.gs         -0.13
POSITIVE LOGITS
sted                0.16
————————————————    0.15
گاÙĨ                0.14
妮                  0.14
گاÙĨÛĮ              0.14
İSİ                 0.14
CONTRACT            0.13
Cater               0.13
WithError           0.13
endar               0.13
Activations Density: 0.036%