Explanations
crimes against humanity and violence
Negative Logits
Token         Value
柠檬          0.66
图            0.65
સરળ           0.63
yardımcı      0.62
préférences   0.61
流畅          0.61
IntelliJ      0.60
பரபர          0.60
鸟            0.60
XML           0.58
Positive Logits
Token         Value
atrocities    1.37
genocide      1.30
killings      1.30
rocities      1.24
brutality     1.20
horrific      1.16
massacre      1.15
massac        1.08
violence      1.05
injustices    1.05
Activations Density: 0.149%