Explanations
references to Muslim-majority countries and their related contexts
Negative Logits
Token      Logit
ods        -0.17
sts        -0.16
elin       -0.15
ÏĥÏĦο      -0.15
isin       -0.15
ensis      -0.14
olves      -0.14
ialect     -0.14
olders     -0.14
jin        -0.14
Positive Logits
Token       Logit
-major      0.27
majority    0.26
-pop        0.21
major       0.20
leaning     0.19
Majority    0.18
dominated   0.18
populated   0.17
controlled  0.16
stronghold  0.16
Activations Density 0.053%
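
A density figure like the one above is commonly read as the fraction of token positions on which the feature fires. The sketch below shows that arithmetic under stated assumptions: the function name `activation_density`, the zero threshold, and the toy activation list are all hypothetical and are not taken from the dashboard or its underlying tooling.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active. The dashboard's exact definition is not stated.
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 100,000 token positions, 53 of them active.
toy = [0.0] * 99_947 + [1.5] * 53
density = activation_density(toy)
print(f"Activations Density {density:.3%}")
```

With this reading, a density of 0.053% corresponds to roughly 53 active positions per 100,000 sampled tokens, which is consistent with a sparse, narrowly scoped feature.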