INDEX
Explanations
references to photographs and visual documentation
Head Attr Weights
0:0.06
1:0.07
2:0.05
3:0.08
4:0.03
5:0.06
6:0.26
7:0.05
8:0.09
9:0.06
10:0.10
11:0.04
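The listing above can be read programmatically. The following is a minimal sketch, assuming each line has the form `head_index:weight` exactly as shown; the helper names `parse_head_weights` and `dominant_head` are hypothetical and are not part of any dashboard API. It parses the twelve entries, reports the head with the largest weight, and sums the listed values (which are rounded to two decimals, so the total need not be exactly 1).

```python
# Head attribution weights copied verbatim from the listing above.
RAW = """0:0.06
1:0.07
2:0.05
3:0.08
4:0.03
5:0.06
6:0.26
7:0.05
8:0.09
9:0.06
10:0.10
11:0.04"""


def parse_head_weights(text):
    """Parse 'head:weight' lines into a {head_index: weight} dict (hypothetical helper)."""
    weights = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        head, value = line.split(":", 1)
        weights[int(head)] = float(value)
    return weights


def dominant_head(weights):
    """Return the head index with the largest weight (hypothetical helper)."""
    return max(weights, key=weights.get)


weights = parse_head_weights(RAW)
top = dominant_head(weights)
total = sum(weights.values())

print(f"dominant head: {top} (weight {weights[top]:.2f})")
print(f"sum of listed weights: {total:.2f}")
```

On the values listed here, head 6 carries the largest weight (0.26), and the twelve rounded weights sum to 0.95.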
Negative Logits
issance         -1.60
tested          -1.51
discrimination  -1.50
ModLoader       -1.41
psychiat        -1.40
utical          -1.34
Reincarn        -1.32
proven          -1.32
reincarn        -1.32
bringing        -1.31
Positive Logits
(@              1.68
]."             1.64
)</             1.49
IMAGES          1.44
)."             1.33
]"              1.32
?]              1.32
url             1.30
ubb             1.29
.","            1.28
Activations Density 0.039%