Explanations
instances of quick movement or escape
actions of running or fleeing
Negative Logits
artisan       -0.75
olia          -0.74
ortium        -0.68
repre         -0.68
orum          -0.60
Birth         -0.57
defic         -0.57
Publication   -0.56
urers         -0.56
enshr         -0.55
Positive Logits
away          1.09
nin           1.06
aways         1.02
hither        1.00
away          0.98
err           0.97
awa           0.95
gs            0.92
screaming     0.88
downhill      0.87
Activations Density: 0.057%