INDEX
Explanations
references to political contexts and terms
Head Attr Weights
0:0.02
1:0.02
2:0.05
3:0.06
4:0.05
5:0.04
6:0.45
7:0.04
8:0.05
9:0.06
10:0.07
11:0.05
Negative Logits
spoilers                                          -1.26
------------------------------------------------  -1.21
wonder                                            -1.18
specificity                                       -1.17
comprehension                                     -1.16
ension                                            -1.12
consideration                                     -1.12
Winter                                            -1.11
olesterol                                         -1.11
ended                                             -1.08
Positive Logits
geist    1.62
neau     1.45
��       1.39
aceae    1.35
zyk      1.35
�        1.33
clinton  1.31
ça       1.28
wagen    1.26
veland   1.25
Activations Density 0.003%