INDEX
Explanations
statements or questions related to decision-making
references to decision-making processes and considerations
Negative Logits
Mare   -0.72
STAT   -0.72
aud    -0.71
Cah    -0.69
Walls  -0.69
enium  -0.69
Schwe  -0.68
Meng   -0.68
eport  -0.67
Stam   -0.67
Positive Logits
choice      2.11
Choice      2.01
choices     1.99
choice      1.95
Choice      1.94
choosing    1.89
choose      1.82
selection   1.77
selections  1.74
chose       1.73
Activations Density 0.416%