Explanations
numeric digits
instances of numerical values, specifically digits and terms associated with them
Negative Logits
Token     Logit
hire      -0.94
lain      -0.82
ouf       -0.77
Shield    -0.73
Study     -0.72
heres     -0.70
nd        -0.68
dep       -0.67
side      -0.66
lace      -0.65
Positive Logits
Token     Logit
digit     0.93
itial     0.90
osaurs    0.90
ized      0.89
digits    0.89
igr       0.87
ILCS      0.86
oded      0.85
digit     0.85
itized    0.84
Activations Density 0.014%
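
The page does not say how the activation density is computed. One common reading, assumed here, is the fraction of sampled token positions on which the feature's activation is non-zero. The sketch below illustrates that reading on synthetic data; the function name `activation_density`, the `threshold` parameter, and the sample itself are hypothetical and chosen only so that the result matches the 0.014% shown above.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all; the source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic sample (not real data): 100,000 positions, 14 of them active.
sample = [0.0] * 99_986 + [1.0] * 14
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.014% means the feature fires on roughly 14 out of every 100,000 tokens, which would be consistent with a sparse feature that responds to a narrow class of inputs.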