Explanations
words related to making decisions, particularly when the decisions are important or impactful
phrases related to decision-making processes
Negative Logits
Token    Logit
amina    -0.77
Dak      -0.73
nic      -0.72
vae      -0.70
rome     -0.69
amen     -0.68
rake     -0.66
jew      -0.66
rica     -0.66
uum      -0.63
Positive Logits
Token        Logit
decisions    1.36
choices      0.99
decision     0.94
makers       0.77
ACTIONS      0.77
levers       0.75
stances      0.74
calculus     0.74
ulkan        0.70
regarding    0.70
Activations Density: 0.015%