INDEX
Explanations
decisions or mentions of decision-making
instances of the word "decision" in various contexts
Negative Logits
ilus      -0.71
aunder    -0.70
ols       -0.68
atures    -0.66
esters    -0.65
ets       -0.63
Sources   -0.63
hin       -0.62
hidden    -0.62
onge      -0.62
Positive Logits
decision        3.50
decisions       2.64
Decision        2.44
judgement       1.82
determination   1.76
choice          1.72
judgment        1.66
deciding        1.62
ruling          1.48
choices         1.43
Activations Density: 0.019%