INDEX
Explanations
mentions of political figures and events related to governance or legislation
Head Attr Weights
0:0.02
1:0.02
2:0.06
3:0.50
4:0.05
5:0.04
6:0.02
7:0.03
8:0.04
9:0.04
10:0.05
11:0.07
Negative Logits
�           -2.08
luaj        -1.99
��極         -1.91
��          -1.88
cffffcc     -1.75
��          -1.70
enthusi     -1.68
機           -1.67
Secondly    -1.66
SourceFile  -1.62
Positive Logits
?'          1.64
?:          1.62
ldon        1.54
weeney      1.43
?」          1.41
Paraly      1.39
rador       1.38
ucha        1.36
reacts      1.36
?'"         1.36
Activations Density 0.011%