Explanations
the intention or capability to perform an action
phrases indicating potential actions or capabilities of subjects
Negative Logits
Token            Value
irony            -0.64
Wrestling        -0.63
treacher         -0.62
understatement   -0.62
ARK              -0.59
significance     -0.58
wheel            -0.58
hypocrisy        -0.57
agitation        -0.57
addons           -0.57
Positive Logits
Token            Value
sylv             1.00
could            0.81
can              0.80
mie              0.77
ctic             0.75
'll              0.74
maximize         0.74
qualify          0.72
arers            0.70
comply           0.70
Activations Density: 0.151%