Explanations
references to saints or saintly figures
Negative Logits
istan    -0.17
neau     -0.15
imers    -0.15
باب      -0.14
hee      -0.14
leep     -0.14
neas     -0.14
_MSK     -0.13
alth     -0.13
zbek     -0.13
Positive Logits
تÛĮ      0.18
rop      0.18
tribute  0.17
Hav      0.17
urai     0.17
ioni     0.16
ione     0.15
omen     0.15
çĥĪ      0.15
igne     0.14
Activations Density 0.008%