Explanations
references to the Islamic deity "Allah" or related terms
references to Islamic religious texts and figures
Negative Logits
Token        Logit
ocamp        -0.79
ersen        -0.72
Wilmington   -0.69
Vaugh        -0.68
downs        -0.63
grounds      -0.62
linux        -0.62
ENS          -0.62
Grimes       -0.61
Bellev       -0.60
Positive Logits
Token        Logit
Ùİ           1.04
ibn          1.03
Ø            1.02
ÙIJ           1.00
abad         0.98
Allaah       0.98
ÙĴ           0.96
Ibn          0.94
ÙĨ           0.94
ر            0.91
Activations Density 0.052%