INDEX
Explanations
references to the reader or the concept of inclusivity
Head Attr Weights
0:  0.02
1:  0.01
2:  0.17
3:  0.07
4:  0.17
5:  0.04
6:  0.17
7:  0.09
8:  0.04
9:  0.04
10: 0.06
11: 0.08
Negative Logits
Ping           -1.36
cription       -1.32
Initialized    -1.31
istent         -1.29
claimed        -1.28
Checks         -1.28
Round          -1.28
Seaf           -1.26
Mahjong        -1.26
phony          -1.26
Positive Logits
attest         1.63
pse            1.61
guide          1.54
版             1.50
>:             1.49
asin           1.48
magnification  1.47
antage         1.46
bos            1.46
rooft          1.38
Activations Density: 0.000%