Explanations
instances of physical movement actions related to running away or escaping
actions related to running or fleeing
Negative Logits
pers        -0.81
tle         -0.75
manship     -0.73
omsky       -0.71
iod         -0.70
cius        -0.68
Twice       -0.67
ancial      -0.67
etheless    -0.67
sych        -0.66
Positive Logits
unnoticed   0.86
undet       0.81
rooft       0.80
alley       0.71
groceries   0.70
bushes      0.69
hills       0.68
onew        0.68
ãĥ´         0.67
bush        0.67
Activations Density 0.193%