INDEX
Explanations
significant political events and their responses
Negative Logits
Token    Logit
agos     -0.17
خرد      -0.16
æĹ       -0.15
Sly      -0.15
MÄĽst    -0.14
bz       -0.14
ĥ        -0.14
elay     -0.13
411      -0.13
artz     -0.13
Positive Logits
Token     Logit
reaction  0.16
rea       0.15
ÏįÏĦε     0.15
Reaction  0.15
Welch     0.15
utut      0.15
Gain      0.14
news      0.14
äd        0.14
immers    0.14
Activations Density 0.196%