INDEX
Explanations
terms related to violent actions such as bombings
mentions of bombing in various contexts
Negative Logits
Token      Logit
laus       -0.87
eva        -0.77
Lear       -0.77
PRE        -0.75
Perfect    -0.75
Vert       -0.72
ITY        -0.72
GRE        -0.72
learn      -0.71
Topics     -0.69
Positive Logits
Token          Logit
bombing        1.40
bombings       1.26
raids          1.16
bomber         1.08
spree          1.04
bombers        1.03
barr           0.99
bombard        0.97
bombardment    0.93
bombed         0.93
Activations Density 0.012%