INDEX
Explanations
critical commentary on societal or political issues
Head Attr Weights
0:0.03
1:0.14
2:0.14
3:0.03
4:0.02
5:0.04
6:0.07
7:0.16
8:0.15
9:0.06
10:0.06
11:0.04
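The weights above can be ranked to see which heads contribute most. This is a minimal sketch, assuming each `index:weight` line is a per-head attribution weight; the names `head_weights` and `top_heads` are hypothetical and not part of the dashboard.

```python
# Head attribution weights copied from the listing above (head index -> weight).
# Assumption: each "index:weight" line is one attention head's weight.
head_weights = {
    0: 0.03, 1: 0.14, 2: 0.14, 3: 0.03, 4: 0.02, 5: 0.04,
    6: 0.07, 7: 0.16, 8: 0.15, 9: 0.06, 10: 0.06, 11: 0.04,
}


def top_heads(weights, k=3):
    """Return the k (head, weight) pairs with the largest weights.

    Ties keep their original order because Python's sort is stable.
    """
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]


if __name__ == "__main__":
    for head, weight in top_heads(head_weights):
        print(f"head {head}: {weight:.2f}")
    # Total of the listed (rounded) weights.
    print(f"sum of listed weights: {sum(head_weights.values()):.2f}")
```

Heads 7, 8, 1, and 2 carry the four largest listed weights, and the listed values total 0.94.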
Negative Logits
Token            Logit
*)               -1.27
Malfoy           -1.18
Hug              -1.17
«                -1.16
Belg             -1.14
Haf              -1.12
Wing             -1.10
[token missing]  -1.10
Hung             -1.08
!)               -1.07
Positive Logits
Token     Logit
etheless  1.56
omo       1.43
erial     1.42
azeera    1.42
ital      1.39
omal      1.35
anyon     1.33
ritional  1.31
emale     1.30
oser      1.29
Activations Density 0.140%