INDEX
Explanations
phrases related to decision-making
Negative Logits
Catalog      -0.91
lator        -0.76
Eye          -0.76
oxide        -0.74
thal         -0.73
atari        -0.72
odder        -0.72
oven         -0.69
oola         -0.68
ufact        -0.64
Positive Logits
soever       0.89
anyone       0.88
they         0.86
anybody      0.79
there        0.77
it           0.71
respondents  0.71
fy           0.70
he           0.69
intentional  0.67
Activations Density: 0.471%