Explanations
instances of human suffering and atrocities
Negative Logits
StoryboardSegue   -0.57
}{*}{}            -0.43
distracting       -0.42
RegistryLite      -0.40
distraction       -0.40
Überras           -0.39
fassungs          -0.39
distractions      -0.38
distract          -0.38
neutralized       -0.38
Positive Logits
abuse             1.56
torture           1.55
tort              1.34
tort              1.32
abuses            1.28
tortura           1.28
abuse             1.27
Abuse             1.25
Abuse             1.25
abused            1.24
Activations Density 0.813%