Explanations
contractions that indicate negation or denial
Negative Logits
aday                -0.15
outu                -0.15
ÏīÏĤ                -0.15
Evel                -0.14
-valu               -0.14
ableViewController  -0.14
rouw                -0.14
.Îij                -0.14
cip                 -0.14
ady                 -0.14
Positive Logits
necessarily         0.23
matter              0.22
matter              0.19
quite               0.18
exactly             0.18
stop                0.17
b                   0.17
change              0.17
aug                 0.17
mattered            0.17
Activations Density: 0.108%