INDEX
Explanations
phrases indicating negation or refusal
negation statements, particularly phrases that include "won't."
Negative Logits
soType          -0.66
Floating        -0.63
itiz            -0.60
linkage         -0.60
Kings           -0.58
Forums          -0.58
Pipeline        -0.57
Hardware        -0.57
illustration    -0.56
edIn            -0.56
Positive Logits
necessarily     0.94
ardless         0.85
itles           0.81
apest           0.80
ember           0.79
rees            0.78
payers          0.76
urtle           0.76
ournament       0.75
angular         0.75
Activations Density 0.035%
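One common reading of an activation density figure, assumed here rather than stated by the source, is the fraction of sampled token positions on which the feature fires at all. A minimal sketch under that assumption follows; the function name `activation_density`, the threshold, and the sample data are all hypothetical and only illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature is
    active, not the mean activation value.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 20,000 token positions, 7 of which activate the feature.
sample = [0.0] * 19993 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.1, 5.0]
density = activation_density(sample)
print(f"{density:.3%}")  # 0.035%
```

Under this reading, a density of 0.035% corresponds to roughly 3 to 4 active positions per 10,000 tokens, which would make this a sparse feature.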