Explanations
actions related to physical movement
actions related to physical movements or changes
Negative Logits
lihood         -0.74
etheless       -0.73
counterpart    -0.69
hampered       -0.67
yet            -0.66
lacking        -0.65
territ         -0.65
Cub            -0.65
lacks          -0.62
consequ        -0.60
Positive Logits
herself        0.82
toilets        0.80
knives         0.79
needles        0.72
ositories      0.71
animate        0.71
igate          0.71
feces          0.70
himself        0.70
adden          0.69
Activations Density: 0.566%