INDEX
Explanations
negations or negative qualifiers in the text
Negative Logits
adera        -0.16
ittal        -0.16
æ¡Ī          -0.16
èŃ           -0.15
ãĥ¼ãĥ¬       -0.14
angelo       -0.14
اÙĦÙĩ        -0.14
ÃŃst         -0.13
ise          -0.13
ummer        -0.13
Positive Logits
.scalablytyped   0.16
èo               0.15
olls             0.15
icari            0.15
çļ               0.14
chaud            0.14
azen             0.14
Verdana          0.13
æİ               0.13
vail             0.13
Activations Density: 0.005%