INDEX
Explanations
phrases related to taking action or actively doing something
phrases related to acting or being in a role
Negative Logits
Gloss          -0.66
artifacts      -0.65
brew           -0.65
aways          -0.65
sinks          -0.64
Printed        -0.64
comings        -0.62
arnaev         -0.62
Riding         -0.61
merce          -0.61
Positive Logits
opposite       0.84
behalf         0.79
differently    0.77
metic          0.75
RECT           0.72
athetic        0.68
uously         0.68
actionGroup    0.68
unilaterally   0.68
umbn           0.67
Activations Density 0.094%