INDEX
Explanations
references to religious figures, particularly priests
mentions of the term "priest" and associated contexts
Negative Logits
DonaldTrump  -0.82
BAT          -0.74
Flavoring    -0.68
oday         -0.67
RFC          -0.66
OX           -0.63
TING         -0.63
issors       -0.63
ilater       -0.63
ggles        -0.62
Positive Logits
esses     1.59
ess       1.11
priests   1.05
priest    0.96
osate     0.88
ordained  0.87
mares     0.78
ificial   0.77
hood      0.75
exorc     0.74
Activations Density 0.014%