Explanations
mentions of cults or cult-like followings
references to cults and their influence in various contexts
Negative Logits
Token       Logit
forth       -0.73
deen        -0.71
Drawn       -0.62
lag         -0.61
Lag         -0.57
cknowled    -0.57
Jav         -0.57
Stra        -0.56
Turk        -0.56
Kay         -0.56
Positive Logits
Token       Logit
ivated      1.10
ivating     1.08
urally      1.04
ivation     1.02
ists        0.98
ogenic      0.96
ura         0.90
ures        0.89
urable      0.89
etic        0.87
Activations Density: 0.035%