Explanations
mentions of military actions and geopolitical tensions
Negative Logits
umi       -0.14
doma      -0.14
<tag      -0.14
undry     -0.13
uktur     -0.13
궁        -0.13
ometown   -0.13
reet      -0.13
eder      -0.13
iamond    -0.12
Positive Logits
aggression    0.34
provoc        0.32
invasion      0.32
provocative   0.31
prov          0.30
annex         0.28
aggressive    0.28
侵            0.27
Prov          0.27
invade        0.26
Activations Density 0.082%