INDEX
Explanations
phrases related to taking action or making decisions
imperative phrases urging action
Negative Logits
holm          -0.70
Cong          -0.70
eers          -0.68
advertised    -0.65
david         -0.65
Smile         -0.63
agre          -0.62
gian          -0.62
ler           -0.62
Kamp          -0.61
Positive Logits
advantage      1.11
aways          1.10
heed           0.97
overs          0.90
care           0.90
aback          0.88
OVER           0.80
autions        0.80
precedence     0.78
precautions    0.77
Activations Density: 0.122%