Explanations
references to the action of shooting or being shot
Negative Logits
hir       -0.22
hift      -0.17
loi       -0.17
hill      -0.17
ials      -0.16
hire      -0.16
anders    -0.16
hud       -0.16
ason      -0.16
VAL       -0.16
Positive Logits
guns      0.34
gun       0.23
ting      0.21
착        0.17
hoops     0.17
tps       0.16
shots     0.16
Fired     0.16
glass     0.15
tober     0.15
Activations Density 0.033%