INDEX
Explanations
decisions or actions being made or taken
statements related to decision-making processes
Negative Logits
dig          -0.80
aeus         -0.70
ritical      -0.68
heres        -0.67
ugs          -0.66
naissance    -0.66
nces         -0.65
osponsors    -0.65
Versions     -0.64
riet         -0.62
Positive Logits
decisions    1.08
decision     1.02
deciding     0.98
priorit      0.88
deliber      0.86
whether      0.83
advis        0.80
Decision     0.80
unanimously  0.80
wisely       0.78
Activations Density 0.218%