Explanations
comparative phrases indicating preference or contrast
Negative Logits
Token      Logit
eru        -0.18
essen      -0.16
azon       -0.16
616        -0.15
mun        -0.14
orf        -0.14
due        -0.14
ocities    -0.14
èŃľ        -0.14
lass       -0.13
Positive Logits
Token          Logit
otherwise      0.23
ones           0.23
others         0.22
other          0.22
mere           0.20
otherwise      0.19
any            0.19
alternatives   0.18
actual         0.17
simply         0.17
Activation Density: 0.129%
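
As a rough illustration of what the density figure means, the sketch below assumes density is the fraction of tokens on which the feature's activation is above zero. That definition, the function name activation_density, the threshold parameter, and the sample data are all assumptions made for illustration; they are not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 100,000 tokens, with the feature firing on 129 of them.
sample = [0.0] * 99_871 + [1.5] * 129
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.129%
```

Under this reading, a density of 0.129% would mean the feature fires on roughly 1 token in 775, which is in keeping with a sparse feature tied to a narrow class of phrases.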