INDEX
Explanations
instances of knocking or door-related actions
Negative Logits
Token        Logit
chain        -0.17
åīĤ          -0.17
ipo          -0.15
amburger     -0.14
chains       -0.14
å¼           -0.14
_ABS         -0.14
sø           -0.14
Mec          -0.14
åĨĨ          -0.14
Positive Logits
Token        Logit
door         0.23
Door         0.21
_door        0.20
-door        0.20
éĸĢ          0.19
éŨ           0.19
Door         0.18
unlocked     0.17
knock        0.16
door         0.16
Activations Density 0.109%
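
As a rough illustration of how a density figure like the one above could be read, here is a minimal sketch. It assumes that density means the fraction of token positions where the feature's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 109 of 100,000 token positions.
sample = [1.5] * 109 + [0.0] * (100_000 - 109)
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.109% corresponds to the feature firing on roughly one token in every 900.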