INDEX
Explanations
HTML tags and formatting elements
Head Attr Weights
0:0.07
1:0.08
2:0.10
3:0.09
4:0.07
5:0.08
6:0.09
7:0.09
8:0.08
9:0.05
10:0.08
11:0.07
Negative Logits
conservancy   -1.40
FTWARE        -1.34
urga          -1.30
GoldMagikarp  -1.23
acceler       -1.21
byss          -1.20
hement        -1.20
awa           -1.17
Footnote      -1.17
orious        -1.12
Positive Logits
"]            1.57
][            1.50
]             1.47
]]            1.43
]-            1.35
']            1.29
widget        1.29
]=            1.29
];            1.29
]:            1.26
Activations Density: 0.008%