Explanations
phrases indicating coercion or forceful actions
instances of being compelled or forced to take specific actions
Negative Logits
laughter      -0.72
Compass       -0.66
Blueprint     -0.66
icious        -0.65
Prediction    -0.65
Garland       -0.64
cli           -0.63
Messages      -0.63
calling       -0.63
ergy          -0.61
Positive Logits
endure        1.09
confront      1.01
reckon        1.00
rethink       0.96
fend          0.93
reconsider    0.92
flee          0.87
cancel        0.87
abandon       0.86
undertake     0.84
Activations Density: 0.068%
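
The density figure can be read as the share of sampled tokens on which the feature fires. The sketch below is illustrative only: it assumes density means the fraction of tokens whose activation is above zero, and the function name, threshold, and sample counts are hypothetical, chosen so that the arithmetic lands on the same percentage. It does not reflect how this page's value was actually computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is strictly
    greater than the threshold (zero by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, 68 of which activate the feature.
sample = [0.0] * (100_000 - 68) + [1.0] * 68
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.068% corresponds to roughly 68 active tokens per 100,000 sampled.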