Explanations
verbs or phrases indicating desires or intentions
expressions of intention and desire related to goals and actions
Negative Logits
river     -0.84
Marino    -0.67
Lima      -0.64
Pak       -0.64
Mellon    -0.62
banks     -0.60
Fey       -0.59
Bett      -0.59
floods    -0.58
riot      -0.58
Positive Logits
accomplish    1.13
achieve       1.06
emulate       0.98
improve       0.95
conserve      0.89
eliminate     0.87
learn         0.84
solve         0.84
introduce     0.83
bring         0.83
Activations Density 0.160%