INDEX
Explanations
instances of people or objects physically turning around
instances of turning or rotating actions
Negative Logits
rought      -0.65
ites        -0.60
addiction   -0.58
implants    -0.58
ulla        -0.57
erness      -0.55
çļ          -0.55
cliffe      -0.54
lements     -0.54
inks        -0.54
Positive Logits
sideways    0.82
lda         0.72
emort       0.72
abruptly    0.71
startled    0.69
scrolling   0.69
GOODMAN     0.68
peed        0.66
undo        0.66
180         0.65
Activations Density: 0.156%