Explanations
actions related to transportation or movement of people and objects
Negative Logits
organ      -0.17
circ       -0.15
wards      -0.15
lesen      -0.14
landing    -0.14
ı          -0.14
Desc       -0.14
argar      -0.14
ward       -0.14
esk        -0.14
Positive Logits
home        0.28
wherever    0.23
into        0.22
along       0.21
across      0.21
everywhere  0.21
sebou       0.19
along       0.19
home        0.18
Across      0.18
Activations Density: 0.115%
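
For readers unfamiliar with the density figure, the sketch below shows one common way such a number is computed, assuming it means the fraction of token positions on which the feature's activation is above zero. The page does not state its exact definition, so the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is read as the share of positions where the
    feature fires at all; the source page does not define it.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 2000 token positions, with the feature firing on 3 of them.
acts = [0.0] * 2000
acts[10] = 1.2
acts[500] = 0.4
acts[1999] = 3.1

density = activation_density(acts)
print(f"Activations Density: {density:.3%}")
```

Under this reading, a density of 0.115% would mean the feature fires on roughly one token position in every 870.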