INDEX
Explanations
phrases or words related to taking action or being proactive
phrases related to action-oriented content
Negative Logits
atten      -0.76
linen      -0.75
hedral     -0.73
herself    -0.70
avorite    -0.70
wine       -0.70
thora      -0.69
itent      -0.69
olls       -0.68
chell      -0.67
Positive Logits
inaction    0.78
BAT         0.72
ãĤŃ         0.71
ICT         0.71
behavi      0.69
SPONSORED   0.68
mode        0.68
ãĥŀ         0.68
behav       0.67
psi         0.67
Activations Density: 0.553%