Explanations
quantities and numerical references within the text
Negative Logits
Token           Logit
eter            -0.16
otope           -0.15
esel            -0.14
nist            -0.13
usercontent     -0.13
s               -0.13
ertas           -0.13
bau             -0.13
imas            -0.13
etter           -0.13
Positive Logits
Token           Logit
-dimensional    0.20
-thirds         0.18
dozen           0.17
-way            0.16
ancy            0.15
instein         0.15
/to             0.14
lava            0.14
-digit          0.14
agers           0.14
Activations Density 0.183%
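
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of token positions on which the feature's activation is above zero (a common convention for this kind of dashboard, but an assumption here). The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only so the result matches the 0.183% shown above.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 183 of them with a nonzero activation.
acts = [0.0] * 100_000
for i in range(183):
    acts[i * 500] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.183%
```

Under this reading, a density of 0.183% means the feature is active on roughly 1 in 550 tokens, which would fit a narrow feature such as the one described above.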