Explanations
references to historical figures and their educational backgrounds
Negative Logits
Hassan      -0.18
iasi        -0.15
udur        -0.14
Äĥm         -0.14
Äĥr         -0.14
Abbas       -0.14
Ľi          -0.14
tÆ°á»Łng    -0.14
غÙĬر       -0.14
ılım        -0.13
Positive Logits
al          0.18
á           0.16
Bust        0.16
sû          0.15
Shay        0.15
Hawai       0.15
á           0.15
Kit         0.15
Ä           0.14
æľĭ         0.14
Activations Density 0.040%