INDEX
Explanations
terms related to beliefs, assumptions, and acknowledgments
phrases that express common beliefs or widely held views
Negative Logits
token       logit
bound       -0.69
ĪĴ          -0.66
adra        -0.65
alos        -0.64
ubi         -0.62
Correction  -0.59
kay         -0.59
pan         -0.58
agraph      -0.58
manent      -0.58
Positive Logits
token               logit
isSpecialOrderable  0.84
underestimate       0.77
Occupations         0.75
skept               0.74
Synopsis            0.71
internationally     0.67
STD                 0.65
taboo               0.65
amongst             0.63
derog               0.63
Activations Density 0.172%