Explanations
country names and their associated actions or relationships
references to specific countries and their geopolitical activities
Negative Logits
åĤ          -0.75
bm          -0.71
strument    -0.66
20439       -0.66
la          -0.65
dstg        -0.63
bour        -0.61
Els         -0.61
ãĥ¬         -0.61
mx          -0.61
Positive Logits
reverted    0.81
retains     0.80
dodged      0.80
responded   0.79
invests     0.79
opted       0.78
reacted     0.78
countered   0.77
certainly   0.76
cannot      0.76
Activations Density: 0.436%