Explanations
phrases related to alternative options or choices
words related to tasks and actions
Negative Logits
Token        Logit
Nobody       -0.83
nobody       -0.81
Nobody       -0.75
hadn         -0.72
itiveness    -0.66
hest         -0.64
Five         -0.63
nothing      -0.61
Seven        -0.61
Nine         -0.60
Positive Logits
Token            Logit
alternatively    1.22
Alternatively    1.05
optionally       1.04
Alternatively    0.99
substituted      0.94
utilize          0.94
depending        0.89
yip              0.89
util             0.88
utilized         0.85
Activations Density: 0.494%