INDEX
Explanations
occurrences of violence and conflict involving groups and individuals
Negative Logits
ãĥ«ãĥī    -0.17
656       -0.16
659       -0.16
crew      -0.14
HAND      -0.14
bubble    -0.14
رز        -0.14
explo     -0.14
irus      -0.14
è¯        -0.14
Positive Logits
burn      0.24
torch     0.23
storm     0.23
burned    0.22
burn      0.21
burning   0.21
burnt     0.21
block     0.20
pill      0.20
storm     0.20
Activations Density 0.124%