INDEX
Explanations
phrases related to political and controversial statements
Negative Logits
creen      -0.82
ABE        -0.71
Tasman     -0.70
Belg       -0.69
wagen      -0.68
shroud     -0.67
Mirage     -0.67
iewicz     -0.67
destro     -0.66
Reprodu    -0.64
Positive Logits
ª          1.33
ł          1.27
IJ          1.24
ij          1.17
¹          1.10
ı          1.08
Ĵ          1.07
«          1.02
¤          1.01
ľ          1.01
Activations Density 0.785%