INDEX
Explanations
structural components and formatting elements commonly used in written documents
Head Attr Weights
0:0.02
1:0.02
2:0.08
3:0.05
4:0.10
5:0.03
6:0.38
7:0.03
8:0.05
9:0.04
10:0.10
11:0.05
Negative Logits
ら:-1.19
Guards:-1.19
greet:-1.18
veland:-1.18
Jae:-1.18
jog:-1.17
Kare:-1.16
Nir:-1.14
Airl:-1.14
KC:-1.11
Positive Logits
psy:1.38
predecessor:1.36
latest:1.29
stretched:1.28
Queen:1.26
contained:1.23
iosity:1.23
going:1.23
independence:1.22
gettable:1.21
Activations Density 0.117%