INDEX
Explanations
phrases encouraging communication or interaction
NEGATIVE LOGITS
Token     Logit
ritis     -0.15
umba      -0.14
inning    -0.14
native    -0.14
erras     -0.13
nob       -0.13
ÑĤÑĥ      -0.13
èĽ        -0.13
mpl       -0.13
ÙĨ        -0.13
POSITIVE LOGITS
Token     Logit
.fre      0.18
esine     0.17
freely    0.17
Wet       0.14
assen     0.14
ìŀIJìľł    0.14
use       0.14
any       0.14
652       0.14
anytime   0.14
Activations Density: 0.021%