INDEX
Explanations
numerical values related to quantities or counts
Negative Logits
481          -0.15
bow          -0.14
see          -0.14
zd           -0.14
Various      -0.14
thing        -0.14
598          -0.14
various      -0.13
either       -0.13
appropriate  -0.13
Positive Logits
isd      0.15
lava     0.15
aghetti  0.15
.lazy    0.15
ards     0.14
erdale   0.14
utta     0.14
coli     0.14
yla      0.14
din      0.14
Activations Density: 0.145%