INDEX
Explanations
sentiments and expressions of personal beliefs
Negative Logits
tru          -0.15
ebo          -0.14
istrovství   -0.14
OTOR         -0.14
versed       -0.14
_UNIX        -0.14
ılıç         -0.13
ias          -0.13
oton         -0.13
ooks         -0.13
Positive Logits
865          0.15
**)&         0.15
834          0.15
coln         0.15
iana         0.15
alink        0.15
лаж          0.15
ucene        0.14
الدين        0.14
Hog          0.14
Activations Density 0.155%