INDEX
Explanations
references to numeric values and quantities
Negative Logits
stuff         -0.15
Numerous      -0.15
eter          -0.15
Thousands     -0.15
許            -0.15
thousands     -0.14
s             -0.14
olls          -0.14
mates         -0.14
stuff         -0.14
Positive Logits
fold          0.27
teenth        0.26
different     0.24
-legged       0.21
-dimensional  0.20
eenth         0.20
-digit        0.20
separate      0.19
dozen         0.19
additional    0.19
Activations Density: 0.185%