INDEX
Explanations
controversial or politically charged topics and events
Negative Logits
partName      -0.85
izons         -0.69
ciplinary     -0.68
Timer         -0.67
autions       -0.66
anecd         -0.66
iple          -0.64
ishops        -0.63
cautiously    -0.63
concise       -0.62
Positive Logits
illegally       1.06
allegedly       1.01
pedoph          0.92
raping          0.91
sexually        0.89
falsely         0.88
purportedly     0.87
homosexuality   0.87
prostitutes     0.87
illegal         0.86
Activations Density: 13.590%