Explanations
words related to fatal incidents or deaths
Negative Logits
orthy        -0.82
arte         -0.78
Remastered   -0.77
erous        -0.75
annis        -0.75
here         -0.74
rica         -0.74
berman       -0.73
MAS          -0.70
meric        -0.67
Positive Logits
istic        1.13
istically    1.12
ities        1.10
flaw         1.04
ized         1.01
accidents    0.96
izing        0.94
shootings    0.93
ist          0.92
fatal        0.90
Activations Density: 0.008%