Explanations
terms related to fatal incidents or deaths
Negative Logits
ced      -0.78
ifully   -0.75
blade    -0.75
leck     -0.74
vict     -0.73
plets    -0.73
cia      -0.73
bered    -0.72
worth    -0.71
chev     -0.68
Positive Logits
Interpret   0.75
MIT         0.73
Nile        0.64
oshop       0.64
Amen        0.63
Breach      0.62
Butterfly   0.62
isan        0.61
Farmer      0.60
tumble      0.60
Activations Density 0.037%