Explanations
terms related to pedophilia and sexual misconduct
references to sexual offenses, particularly against minors
Negative Logits
tune          -0.82
FORM          -0.78
shift         -0.68
Nun           -0.68
univers       -0.68
wisdom        -0.68
ISO           -0.67
bearer        -0.66
resemblance   -0.66
model         -0.66
Positive Logits
estation      1.09
assing        1.07
inations      1.07
inating       1.07
appings       1.06
ilings        1.06
acist         1.05
ocide         1.02
ination       0.97
ruption       0.96
Activations Density: 0.132%