Explanations
references to military weaponry and artillery
Negative Logits
Token      Logit
knife      -0.16
åī£        -0.15
Mach       -0.15
erville    -0.15
dagger     -0.15
Slut       -0.15
Lind       -0.15
rive       -0.14
Axel       -0.14
пÑĢиÑĤ     -0.14
Positive Logits
Token      Logit
battery    0.26
batteries  0.26
Battery    0.24
shells     0.23
Batter     0.21
Battery    0.21
æ¦         0.21
canon      0.21
Art        0.20
batter     0.20
Activations Density 0.034%
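
To put the density figure in concrete terms, here is a minimal sketch of how such a value could be computed. It assumes that "activation density" means the fraction of sampled tokens on which the feature has a nonzero activation; the page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (tokens with activation > threshold) / (all tokens).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, 34 of which activate the feature.
sample = [0.0] * 99_966 + [1.5] * 34
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.034% corresponds to the feature firing on roughly 34 out of every 100,000 tokens.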