INDEX
Explanations
specific phrases or patterns in transcribed or mixed language text
Negative Logits
ved     -0.15
bedo    -0.14
lek     -0.14
æŁĵ     -0.14
ranks   -0.14
iev     -0.14
æ¢      -0.14
šlo     -0.13
yun     -0.13
ting    -0.13
Positive Logits
u       0.24
sa      0.22
koje    0.19
koji    0.19
ko      0.18
Ñĥ      0.18
ко      0.18
,u      0.17
(u      0.17
odak    0.16
Activations Density 0.007%
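
Some tokens in the logit lists above (for example "Ñĥ" and "æŁĵ") look like the printable stand-ins that byte-level BPE tokenizers use for raw UTF-8 bytes. The sketch below shows one way to turn such strings back into readable text. It assumes the model uses a GPT-2 style byte-to-character table, which this page does not state; the helper names `bytes_to_unicode` and `decode_token` are illustrative and are not part of the dashboard.

```python
def bytes_to_unicode():
    """Build the GPT-2 style map from byte value (0-255) to a printable character.

    ASSUMPTION: the tokenizer behind this feature uses this table.
    """
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: printable stand-in character -> original byte value.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Decode a byte-level token string to text.

    Tokens containing characters outside the table are assumed to be
    already decoded and are returned unchanged. Incomplete UTF-8
    sequences are rendered with the U+FFFD replacement character.
    """
    if any(ch not in BYTE_DECODER for ch in token):
        return token
    raw = bytes(BYTE_DECODER[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["Ñĥ", "æŁĵ", "æ¢", "ко", "šlo", "koji"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, "Ñĥ" decodes to the Cyrillic letter "у", which fits the other positive-logit tokens ("u", "ко", "koji", "koje"). A token such as "æ¢" holds only part of a multi-byte character, so it cannot be decoded on its own.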