INDEX
Explanations
proper nouns like names and organizations
Head Attr Weights
0:0.07
1:0.03
2:0.30
3:0.12
4:0.11
5:0.06
6:0.03
7:0.02
8:0.07
9:0.07
10:0.06
11:0.02
Negative Logits
glers: -1.77
ankles: -1.57
enterprises: -1.44
desks: -1.39
hops: -1.35
poles: -1.34
levers: -1.31
bats: -1.30
cigars: -1.29
clubs: -1.27
Positive Logits
querque: 1.67
ット: 1.58
Serv: 1.53
roy: 1.40
reau: 1.28
roma: 1.27
ーテ: 1.24
────: 1.20
cus: 1.20
izo: 1.18
Activations Density 0.027%