Explanations
names or references to individuals, particularly those with "Tina" or "Nina" in them
Negative Logits
Token         Logit
¤ij           -0.16
áte           -0.16
ازÙħ          -0.16
ýv            -0.15
anooga        -0.15
ülü           -0.15
wahl          -0.15
acyj          -0.14
.metamodel    -0.14
ttp           -0.14
Positive Logits
Token         Logit
o             0.20
uer           0.19
amate         0.17
emez          0.15
elli          0.15
les           0.15
دارÛĮ         0.15
is            0.15
res           0.14
stip          0.14
Activations Density 0.017%
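
The page does not say how the density figure is computed. A minimal sketch, assuming it means the fraction of token positions at which the feature has a nonzero activation, could look like the following. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature fires;
    the dashboard itself does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 2 of which fire.
sample = [0.0] * 9998 + [1.3, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.017% would mean the feature fires on roughly 17 out of every 100,000 token positions.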