Explanations
names that start with "Abdul" and have varying activations depending on the specific name
specific names of Middle Eastern political figures
Negative Logits
Token       Logit
kefeller    -0.90
sugg        -0.90
arteries    -0.72
acad        -0.71
emouth      -0.71
flix        -0.66
eport       -0.66
gears       -0.66
moth        -0.65
chrom       -0.65
Positive Logits
Token       Logit
ij士         0.84
Ń           0.84
ali         0.82
Kar         0.78
Rah         0.78
Amin        0.77
lah         0.77
irin        0.76
ÙĬ          0.75
Rahman      0.74
Activations Density 0.132%