Explanations
mentions of prayer and religious belief, particularly related to Islam
references to religious practices or traditions
Negative Logits
ymm          -0.72
irection     -0.69
GW           -0.65
Pog          -0.65
oret         -0.65
             -0.63
Publisher    -0.61
Gamble       -0.61
Emer         -0.61
aba          -0.59
Positive Logits
respectively  1.10
goddamn       0.85
fuckin        0.83
fucking       0.81
))))          0.80
NetMessage    0.79
blah          0.79
shit          0.78
!".           0.77
etc           0.76
Activations Density 1.655%