Explanations
phrases indicating inquiry or seeking information
Negative Logits
Token       Logit
achi        -0.17
.Proxy      -0.17
jÃł         -0.15
enu         -0.14
NET         -0.14
monds       -0.13
iei         -0.13
voks        -0.13
'\''        -0.13
inki        -0.13
Positive Logits
Token       Logit
whether     0.20
uil         0.17
why         0.16
about       0.16
whether     0.16
how         0.15
ENER        0.15
θι          0.14
.jdesktop   0.14
_about      0.14
Activations Density: 0.048%