Explanations
numerical data and statistics in the text
Negative Logits
Token     Logit
idth      -0.15
anst      -0.14
ÑĪкÑĥ     -0.13
lož       -0.13
/trunk    -0.13
Dub       -0.13
098       -0.13
dub       -0.12
Ì£        -0.12
Veg       -0.12
Positive Logits
Token     Logit
pp        0.26
pp        0.22
pages     0.21
Pages     0.18
Pages     0.18
pages     0.17
pp        0.17
PP        0.15
vi        0.15
pps       0.15
Activations Density 0.089%