INDEX
Explanations
URLs and web links in the text
Head Attr Weights
0:0.08
1:0.05
2:0.13
3:0.05
4:0.01
5:0.04
6:0.05
7:0.11
8:0.04
9:0.03
10:0.04
11:0.32
Negative Logits
↵               -2.38
aution          -2.37
sufficient      -2.28
¶               -2.25
覚醒            -2.23
Parables        -2.15
�               -2.15
;;;;;;;;;;;;    -2.15
aditional       -2.10
exting          -2.08
Positive Logits
://     2.23
�       2.17
ames    2.15
@       2.10
Oaks    2.04
_       1.99
Reid    1.97
@       1.94
kios    1.93
Town    1.92
Activations Density: 0.004%