INDEX
Explanations
proper nouns or specific names
Head Attr Weights
0:0.02
1:0.02
2:0.04
3:0.05
4:0.05
5:0.04
6:0.47
7:0.03
8:0.04
9:0.07
10:0.07
11:0.05
Negative Logits
coat      -1.53
Cheong    -1.35
iquette   -1.21
Brush     -1.21
bluff     -1.20
iners     -1.20
ILA       -1.18
hower     -1.18
icles     -1.17
ecause    -1.16
Positive Logits
ilib      1.45
ét        1.41
pees      1.41
MpServer  1.35
escal     1.33
Galaxy    1.31
ゼウス    1.31
encia     1.28
pez       1.28
Wein      1.27
Activations Density 0.004%