Explanations
negations and phrases of denial
Negative Logits
strup             -0.16
leton             -0.15
BuilderFactory    -0.14
à¹ĥห              -0.14
zhou              -0.14
áºŃt              -0.14
_chg              -0.14
imar              -0.13
Ware              -0.13
rowspan           -0.13
Positive Logits
شر                0.19
iface             0.16
ETS               0.16
just              0.15
Äħż               0.15
just              0.15
ensch             0.14
CHK               0.14
ets               0.14
Tort              0.14
Activations Density: 0.036%