Explanations
verbs that indicate making a conscious decision
instances of the word "choose" indicating decisions or choices being made
Negative Logits
brance      -0.78
urst        -0.70
uum         -0.69
utenberg    -0.67
otor        -0.66
atomic      -0.64
ijk         -0.64
Net         -0.63
awar        -0.61
iter        -0.61
Positive Logits
chooses     0.89
chose       0.89
choose      0.83
axe         0.81
chosen      0.78
choices     0.76
ACTIONS     0.76
choosing    0.71
wisely      0.70
生          0.70
Activations Density 0.026%