Explanations
references to influential figures and their actions or opinions
Head Attr Weights
0:0.02
1:0.34
2:0.07
3:0.03
4:0.03
5:0.06
6:0.08
7:0.07
8:0.06
9:0.09
10:0.06
11:0.06
Negative Logits
etheless    -1.87
theless     -1.69
entimes     -1.59
xiety       -1.53
backbone    -1.51
ensional    -1.49
commit      -1.48
whatever    -1.48
mosqu       -1.48
proport     -1.46
Positive Logits
(@          1.83
�           1.75
[+          1.68
·           1.67
*)          1.64
Cock        1.62
】          1.48
Scott       1.47
Kendall     1.43
Raven       1.40
Activations Density 0.100%