INDEX
Explanations
instances where actions are being taken or potential outcomes are being considered
qualifying adverbs that indicate uncertainty or variability
Negative Logits
ayers    -0.82
acas     -0.73
itter    -0.72
achev    -0.71
ierrez   -0.71
kees     -0.70
Ĥİ       -0.69
ente     -0.69
atched   -0.68
iren     -0.68
Positive Logits
outright      0.94
downright     0.89
even          0.85
entire        0.74
assassinate   0.73
entirety      0.67
worse         0.67
fatally       0.65
whole         0.63
vice          0.63
Activations Density: 0.160%