INDEX
Explanations
Words related to criminal activity, especially terms for murder and other unlawful killing
Negative Logits
stem      -0.71
Premium   -0.70
API       -0.69
ost       -0.67
whe       -0.67
Appl      -0.67
Tokens    -0.67
ota       -0.65
ECO       -0.64
PAC       -0.63
Positive Logits
murder          3.48
murders         2.61
Murder          2.56
homicide        2.38
murdering       2.18
murderer        2.12
assassination   2.06
murderers       2.05
manslaughter    2.03
killings        2.00
Activations Density 0.028%
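
As a rough illustration of what a density figure like this could mean, here is a minimal sketch. It assumes that activation density is the fraction of sampled token positions on which the feature's activation is above zero; that definition is an assumption and is not stated on this page. The function name `activation_density` and the sample counts are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumed definition: the page itself does not say how density is computed.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample (invented counts, not real data):
# 100,000 token positions, 28 of which activate the feature.
sample = [0.0] * 99_972 + [1.0] * 28
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.028%
```

Under this reading, a density of 0.028% would mean the feature fires on roughly 28 of every 100,000 tokens, which fits a feature tied to a narrow vocabulary.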