Explanations
mentions of Muslims and related topics, including various forms of identity and community presence
Negative Logits
Ĥ¹          -0.16
便          -0.15
esk         -0.15
ffffffff    -0.15
.ua         -0.14
Sparks      -0.14
edere       -0.13
atern       -0.13
"crypto     -0.13
AZY         -0.13
Positive Logits
sWith       0.14
addslashes  0.14
faith       0.14
meli        0.14
orio        0.14
sla         0.14
-Owned      0.13
utton       0.13
ves         0.13
aph         0.13
Activations Density: 0.011%