INDEX
Explanations
mentions of bodily harm or injury
references to physical injury or harm
Negative Logits
Developers    -0.76
jam           -0.67
Arch          -0.66
UTC           -0.65
OP            -0.65
arp           -0.64
arch          -0.63
developers    -0.63
former        -0.62
Today         -0.62
Positive Logits
bodily         3.78
physiological  1.21
Bod            1.13
genital        1.10
bowel          1.05
ODY            1.05
limbs          1.05
disemb         1.04
earthly        1.04
genitals       1.01
Activations Density: 0.016%