Explanations
content related to political statements and government actions
Head Attr Weights
0:0.05
1:0.16
2:0.04
3:0.04
4:0.02
5:0.05
6:0.10
7:0.11
8:0.03
9:0.04
10:0.21
11:0.09
Negative Logits
etsy        -1.33
someday     -1.29
Favorite    -1.27
��          -1.25
Kids        -1.24
Lovecraft   -1.23
Glow        -1.22
||||        -1.21
NYC         -1.21
gravity     -1.20
Positive Logits
ceasefire   1.76
Kazakh      1.68
Ankara      1.67
Erdogan     1.66
Abbas       1.66
Rohingya    1.64
Lavrov      1.64
Nato        1.63
Tayyip      1.61
Sudan       1.60
Activations Density 1.127%