INDEX
Explanations
phrases related to physical actions directed towards objects
phrases related to actions of lifting, breaking, or moving objects
Negative Logits
token          logit
interstitial   -0.64
DEFENSE        -0.61
Annotations    -0.59
divergence     -0.58
Defendants     -0.57
tert           -0.56
Featured       -0.56
Occupations    -0.55
departures     -0.55
backdrop       -0.54
Positive Logits
token          logit
cheaply        0.88
handedly       0.87
nicely         0.86
yourself       0.85
somew          0.84
roy            0.81
herself        0.78
yourselves     0.78
gently         0.77
properly       0.77
Activations Density: 0.148%