Explanations
mentions of the word "Si" and its variations
Negative Logits
leo      -0.18
iye      -0.17
chein    -0.17
umbo     -0.16
pez      -0.16
bih      -0.16
цер      -0.16
umper    -0.15
appare   -0.15
etti     -0.15
Positive Logits
ames     0.29
erra     0.27
emens    0.26
enna     0.25
oux      0.25
empre    0.21
erras    0.21
ERR      0.19
erral    0.19
esta     0.19
Activations Density 0.009%