INDEX
Explanations
titles or ranks related to military or governmental positions
Negative Logits
Token    Logit
eker     -0.07
aeda     -0.07
enas     -0.07
meden    -0.06
uyu      -0.06
lug      -0.06
adesh    -0.06
taş      -0.06
ousse    -0.06
رات      -0.06
Positive Logits
Token    Logit
840      0.07
354      0.06
Favor    0.06
باست     0.06
انه      0.06
Rum      0.06
417      0.06
839      0.06
872      0.06
掌       0.06
Activations Density 0.004%