INDEX
Explanations
phrases related to taking action or following a specific sequence of steps to achieve a goal
phrases that introduce objectives or sequences of actions
Negative Logits
tek       -0.75
rament    -0.70
vous      -0.69
child     -0.68
tur       -0.67
laus      -0.67
leigh     -0.67
tiny      -0.67
vana      -0.66
Dirk      -0.65
Positive Logits
lies      0.79
Osw       0.76
liness    0.73
order     0.71
xual      0.69
eous      0.67
ournals   0.66
arily     0.65
orchestr  0.65
ogue      0.64
Activations Density 0.017%