INDEX
Explanations
phrases related to criminal activities and investigations
references to planned or intentional criminal activities
Negative Logits
Token         Logit
Tokens        -0.88
notation      -0.73
Cosponsors    -0.68
helps         -0.67
urn           -0.67
imei          -0.67
then          -0.66
oise          -0.64
ometers       -0.63
anguages      -0.63
Positive Logits
Token         Logit
spontaneous   1.18
accidental    1.11
retaliation   1.08
perpetrated   1.07
arson         1.07
revenge       1.03
suicides      1.00
robbery       0.99
suicide       0.98
deliberate    0.96
Activations Density 0.270%
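
A density figure like the one above is usually read as the fraction of token positions on which the feature is active. The sketch below illustrates that reading only. The function name `activation_density`, the zero threshold, and the sample activations are assumptions made for illustration, not taken from the source or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "active" means strictly greater than threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 27 of 10,000 token positions.
sample = [0.0] * 9973 + [1.5] * 27
density = activation_density(sample)
print(f"{density:.3%}")
```

With these made-up activations the printed value matches the format of the figure reported above; real dashboards may compute density over a different sample or with a different activity threshold.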