INDEX
Explanations
references to violent actions resulting in death
references to fatal violence or death
Negative Logits
soType      -0.90
MpServer    -0.89
Avg         -0.83
ECA         -0.80
Catalog     -0.79
Clar        -0.76
MN          -0.74
Brew        -0.73
Limited     -0.73
CLIENT      -0.73
Positive Logits
blow        0.87
adder       0.82
ously       0.81
stroke      0.79
face        0.79
claw        0.78
anguage     0.78
psychiat    0.75
guard       0.74
retard      0.73
Activations Density: 0.024%