INDEX
Explanations
questions beginning with "How."
Negative Logits
ttp         -0.17
noon        -0.15
olik        -0.15
éĽħ         -0.14
uegos       -0.14
emek        -0.14
ayment      -0.14
nesia       -0.14
engo        -0.14
onto        -0.14
Positive Logits
ever        0.19
else        0.17
otherwise   0.17
ever        0.16
Otherwise   0.15
.Framework  0.15
-ever       0.15
Otherwise   0.15
else        0.15
203         0.15
Activations Density: 0.056%