Explanations
the presence of the letters "sp" in words
Negative Logits
MetroFramework  -0.15
usters          -0.15
clipse          -0.15
usted           -0.14
ÛĮست            -0.14
campo           -0.14
YPE             -0.14
éł              -0.13
veis            -0.13
خت              -0.13
Positive Logits
sp     0.36
Sp     0.28
-sp    0.26
Sp     0.24
.Sp    0.21
/sp    0.20
sp     0.19
.sp    0.18
forth  0.18
spur   0.17
Activations Density 0.017%