Explanations
phrases related to consequences or outcomes
referring expressions that highlight a consequence or result of some action
Negative Logits
andon         -0.85
mares         -0.83
rences        -0.83
NetMessage    -0.82
bots          -0.78
oots          -0.77
DEV           -0.77
ople          -0.75
events        -0.74
uden          -0.73
Positive Logits
result        1.58
consequence   1.47
reminder      1.32
testament     1.14
precaution    1.12
matter        1.08
bonus         1.04
consolation   1.02
footnote      1.02
refres        0.98
Activations Density: 0.046%