INDEX
Explanations
phrases indicating movement or transitions
Negative Logits
entire      -0.16
ãģıãģł      -0.15
olini       -0.15
andes       -0.14
AYS         -0.14
ona         -0.14
onas        -0.14
ÙĪØ¨Ø©      -0.13
Harbour     -0.13
vÄĽdom      -0.13
Positive Logits
коз         0.15
another     0.14
detriment   0.14
ØŃÙĬØ«      0.14
noch        0.14
unch        0.14
ugg         0.14
uten        0.14
sis         0.13
vens        0.13
Activations Density: 0.127%