INDEX
Explanations
references to violent events and their impact
Negative Logits
982      -0.16
alc      -0.15
810      -0.15
umpt     -0.15
ISP      -0.15
924      -0.14
onga     -0.14
اشÛĮ     -0.13
ahren    -0.13
829      -0.13
Positive Logits
suicide         0.24
su              0.23
Suicide         0.21
coordinated     0.19
attack          0.19
blast           0.18
blasts          0.18
attacks         0.18
Targets         0.17
coordination    0.17
Activations Density 0.037%
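
The dashboard does not define how the density figure is computed. As a rough sketch, assuming it means the fraction of sampled token positions on which the feature has a nonzero activation, the calculation might look like the following. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    The source dashboard does not state its exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 37 of which activate the feature.
sample = [0.0] * 99_963 + [1.5] * 37
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.037% corresponds to roughly 37 active positions per 100,000 tokens, which would make this a sparse feature.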