Explanations
names of individuals and their affiliations within an academic context
Negative Logits
utow     -0.15
Musk     -0.14
هند      -0.14
اهم      -0.14
yectos   -0.14
سالم     -0.14
lw       -0.14
phin     -0.14
abox     -0.14
حاد      -0.13
Positive Logits
Mehr     0.28
Pour     0.28
pour     0.26
Hos      0.25
Pour     0.25
Beh      0.24
nia      0.24
Tehran   0.23
ollah    0.23
Pey      0.22
Activations Density 0.068%