Explanations
contradictory statements
expressions of uncertainty or ambiguity
Negative Logits
senal    -0.85
eleph    -0.78
subur    -0.68
exting   -0.65
activ    -0.63
ounter   -0.63
anmar    -0.62
inav     -0.61
pione    -0.60
ò        -0.60
Positive Logits
caveats      0.90
editors      0.75
Krugman      0.75
paraph       0.75
reader       0.75
nonetheless  0.75
commenters   0.74
apologies    0.74
readers      0.74
Slate        0.73
Activations Density: 1.327%