Explanations
references to destruction, suffering, and the impact of conflict on communities
Negative Logits
restraining    -0.16
eca            -0.15
юр             -0.15
jur            -0.14
поск           -0.14
extrad         -0.14
iji            -0.14
cesso          -0.14
iez            -0.14
_cg            -0.13
Positive Logits
destruction    0.50
destroy        0.50
destroyed      0.49
destroy        0.47
Destroy        0.46
.destroy       0.43
DEST           0.42
_destroy       0.41
destroying     0.41
destroys       0.41
Activations Density 0.273%