INDEX
Explanations
references to infamous crimes and their perpetrators
Negative Logits
åıĭ         -0.17
enza        -0.16
izza        -0.15
altung      -0.15
smugg       -0.14
antar       -0.14
iggers      -0.14
steen       -0.14
smuggling   -0.14
esterday    -0.14
Positive Logits
serial      0.24
serial      0.23
Serial      0.19
spree       0.18
Serial      0.18
.serial     0.17
abilia      0.17
acker       0.16
(serial     0.16
prow        0.16
Activations Density 0.067%