INDEX
Explanations
references to online content, particularly links and sources
Head Attr Weights
0:0.04
1:0.01
2:0.14
3:0.11
4:0.20
5:0.03
6:0.04
7:0.13
8:0.04
9:0.03
10:0.08
11:0.10
Negative Logits
��          -1.54
ォ           -1.49
etheless    -1.40
龍�          -1.34
pires       -1.33
バ           -1.32
Marginal    -1.32
theless     -1.31
ディ          -1.31
�           -1.30
Positive Logits
ibliography   1.58
publications  1.55
transcripts   1.54
respective    1.51
contacts      1.51
etc           1.49
websites      1.44
landmarks     1.41
Pastebin      1.38
etc           1.38
Activations Density 0.012%