INDEX
Explanations
mentions of individuals being found deceased or harmed
instances of people being discovered in a deceased state
Negative Logits
dayName         -0.83
yip             -0.83
lishes          -0.81
yright          -0.72
creation        -0.70
dos             -0.65
eatures         -0.65
rats            -0.64
expectations    -0.64
annis           -0.63
Positive Logits
guilty          1.05
Guilty          0.92
dead            0.86
tampering       0.78
âĸĪ             0.78
unfit           0.74
wanting         0.74
lying           0.73
dylib           0.72
liable          0.72
Activations Density 0.056%