INDEX
Explanations
phrases related to decision-making and taking steps
pronouns and references to collective actions or experiences
Negative Logits
ilty          -0.70
Relations     -0.68
recognition   -0.68
sensing       -0.68
resemblance   -0.66
Reply         -0.64
enjoyment     -0.64
comings       -0.63
conviction    -0.63
witnessing    -0.62
Positive Logits
resorted      1.46
opted         1.27
devised       1.26
decided       1.19
instituted    1.09
undertook     1.06
chose         1.04
teamed        1.01
enlisted      1.01
embarked      0.98
Activations Density: 0.308%