INDEX
Explanations
instances of the word "more."
Negative Logits
Token        Logit
dn           -0.17
eln          -0.17
AGO          -0.16
oval         -0.15
izer         -0.15
eyle         -0.14
arus         -0.13
ä»ĺãģij       -0.13
ushi         -0.13
rette        -0.13
Positive Logits
Token        Logit
about        0.26
information  0.22
about        0.20
info         0.19
detail       0.18
information  0.18
ways         0.18
details      0.17
reason       0.17
información  0.16
Activations Density: 0.012%