INDEX
Explanations
references to groups or movements characterized by strong devotion or following
references to "cult" and its variations
Negative Logits
forth         -0.76
deen          -0.66
Downloadha    -0.64
Ake           -0.62
Stra          -0.62
FORE          -0.62
Bloomberg     -0.61
OB            -0.61
lihood        -0.61
berry         -0.60
Positive Logits
ivating       1.23
ivated        1.20
ivation       1.19
urally        1.14
ures          1.07
ists          1.04
urable        0.98
ura           0.96
ogenic        0.96
ured          0.93
Activations Density: 0.050%
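
To put the density figure in concrete terms, here is a minimal sketch. It assumes that density means the fraction of tokens on which the feature's activation is above zero; the page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 1 token out of 2,000.
sample = [0.0] * 1999 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.050%
```

Under that assumption, a density of 0.050% corresponds to the feature firing on roughly one token in every two thousand.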