Explanations
Names of notable individuals, particularly Islamic leaders and public figures
Head Attr Weights
0:0.02
1:0.02
2:0.04
3:0.06
4:0.04
5:0.04
6:0.44
7:0.05
8:0.04
9:0.06
10:0.07
11:0.06
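As a quick way to read the weights listed above, here is a minimal Python sketch that picks out the head with the largest weight and sums the listed values. The names `head_attr_weights`, `dominant_head`, and `total_weight` are illustrative and not part of any dashboard API, and treating the numbers as per-head shares of the feature's attribution is an assumption about what the dashboard reports.

```python
# Head attribution weights copied from the listing above (head index -> weight).
# Treating these as per-head attribution shares is an assumption.
head_attr_weights = {
    0: 0.02, 1: 0.02, 2: 0.04, 3: 0.06, 4: 0.04, 5: 0.04,
    6: 0.44, 7: 0.05, 8: 0.04, 9: 0.06, 10: 0.07, 11: 0.06,
}


def dominant_head(weights):
    """Return (head, weight) for the head with the largest weight."""
    head = max(weights, key=weights.get)
    return head, weights[head]


def total_weight(weights):
    """Sum all listed weights; rounding may keep this from reaching 1.0."""
    return sum(weights.values())


top_head, top_weight = dominant_head(head_attr_weights)
total = total_weight(head_attr_weights)

print(f"dominant head: {top_head} (weight {top_weight:.2f})")
print(f"sum of listed weights: {total:.2f}")
```

With the values as listed, head 6 stands out at 0.44, while every other head is at 0.07 or below. The listed weights sum to 0.94 rather than 1.00, which would be consistent with each value having been rounded to two decimals.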
Negative Logits
ngth       -1.68
ources     -1.45
BLIC       -1.40
inarily    -1.30
dilig      -1.24
emort      -1.20
JUSTICE    -1.19
agric      -1.19
idity      -1.17
incial     -1.16
Positive Logits
ー�         1.48
iru        1.39
chenko     1.35
ou         1.32
coni       1.31
roo        1.28
oud        1.26
oul        1.26
ミ         1.25
Barbie     1.24
Activations Density 0.003%