INDEX
Explanations
phrases indicating restrictions or limitations in communication
Negative Logits
eyin      -0.07
zano      -0.07
argin     -0.07
seins     -0.07
つけ      -0.07
zim       -0.06
乳        -0.06
ccion     -0.06
くだ      -0.06
つぶ      -0.06
Positive Logits
publicly  0.07
ussen     0.07
freely    0.07
ucher     0.07
nap       0.06
ccount    0.06
quot      0.06
самост    0.06
Bale      0.06
Gus       0.06
Activation Density: 0.001%