Explanations
mentions of physical injuries or blood-related terms
instances of the word "bleeding" and related terms
Negative Logits
姫        -0.81
wcsstore  -0.75
bnb       -0.74
adal      -0.74
idable    -0.73
BOOK      -0.69
essee     -0.69
inqu      -0.68
lator     -0.68
fman      -0.66
Positive Logits
wounds    0.93
hemorrh   0.89
throats   0.83
blood     0.80
prof      0.79
necks     0.78
bleeding  0.78
wrists    0.77
crimson   0.76
bleed     0.76
Activations Density 0.077%
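
The density figure above is not defined on this page. A common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is above zero. The sketch below illustrates that definition only. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" means the share of sampled tokens on which
    the feature fires; the source page does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample (not real dashboard data): 77 active tokens
# out of 100,000, chosen so the result matches the reported figure.
sample = [1.5] * 77 + [0.0] * (100_000 - 77)
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.077% means the feature is active on roughly 77 of every 100,000 tokens, which is consistent with a narrow feature tied to injury and blood vocabulary.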