Explanations
references to incidents involving firearms or violence
Negative Logits
Token       Logit
blinking    -0.15
desired     -0.15
gue         -0.14
desired     -0.14
Debugger    -0.14
blink       -0.14
ÏĦοÏį       -0.14
dbg         -0.14
à¤Ĺर        -0.14
sink        -0.14
Positive Logits
Token       Logit
dial        0.23
sprint      0.22
dash        0.21
dia         0.20
Grab        0.20
rush        0.20
radio       0.19
ph          0.19
rushed      0.19
heard       0.19
Activations Density 0.199%