INDEX
Explanations
instances where the text signals that none of a set of options are applicable or relevant
phrases indicating negation or absence
Negative Logits
Token     Logit
lished    -0.55
bledon    -0.54
srf       -0.53
widest    -0.53
turf      -0.53
drift     -0.53
seas      -0.52
Drift     -0.51
descent   -0.51
Beir      -0.51
Positive Logits
Token        Logit
conom        1.25
lect         0.93
theless      0.91
ffect        0.87
galitarian   0.85
except       0.84
uther        0.82
essee        0.79
etting       0.78
lust         0.77
Activations Density 0.023%