INDEX
Explanations
numerical information and dates
Negative Logits
ewire      -0.17
lingen     -0.15
::::::     -0.15
ickey      -0.15
.cf        -0.15
enden      -0.14
emoc       -0.14
wow        -0.14
ureau      -0.13
amerate    -0.13
Positive Logits
th         0.26
nd         0.18
st         0.18
rd         0.18
third      0.17
_nth       0.17
second     0.16
fourth     0.16
ly         0.15
ith        0.15
Activations Density: 0.025%