INDEX
Explanations
names or references related to Middle Eastern individuals or locations
words related to geographic or cultural identities
Negative Logits
glers         -0.81
gerald        -0.62
epid          -0.61
fighters      -0.58
smith         -0.57
kittens       -0.55
SHIP          -0.55
WARE          -0.54
decoration    -0.54
fighter       -0.53
Positive Logits
ghan          0.81
angan         0.72
anmar         0.71
ahar          0.71
han           0.70
arat          0.68
az            0.67
ai            0.67
itia          0.67
ateg          0.66
Activations Density: 0.203%