INDEX
Explanations
contractions indicating negation or abbreviation
Negative Logits
çļ        -0.88
çīĪ       -0.74
èĪ        -0.73
è¦ļéĨĴ    -0.73
inav      -0.73
æĥ        -0.69
obo       -0.69
çĶŁ       -0.68
velt      -0.68
onyms     -0.68
Positive Logits
necessarily   1.37
exactly       1.22
gonna         1.13
bothering     1.05
quite         1.04
icable        1.00
supposed      0.98
really        0.96
bothered      0.94
terribly      0.93
Activations Density: 0.063%