INDEX
Explanations
references to proportions or fractions
Head Attr Weights
0:0.01
1:0.02
2:0.10
3:0.25
4:0.01
5:0.03
6:0.09
7:0.04
8:0.03
9:0.08
10:0.07
11:0.22
Negative Logits
virt: -1.40
の�: -1.22
ロ: -1.21
tumblr: -1.17
ーティ: -1.15
imeo: -1.14
テ: -1.12
き: -1.11
RAW: -1.08
MT: -1.07
Positive Logits
squared: 1.26
anymore: 1.24
abyte: 1.16
commuting: 1.14
apprehens: 1.10
assium: 1.09
nor: 1.09
allotted: 1.06
hindsight: 1.04
pie: 1.04
Activations Density 0.004%