INDEX
Explanations
mentions of cults and cult-like followings
references to cults and cult-related concepts
Negative Logits
forth        -0.75
etsk         -0.73
sole         -0.67
balance      -0.66
etter        -0.65
ptroller     -0.61
Downloadha   -0.61
Stra         -0.60
CO           -0.59
Fowler       -0.59
Positive Logits
ivating      1.41
ivated       1.32
urally       1.26
ivation      1.24
ures         1.03
ishly        0.99
ured         0.99
ura          0.98
atical       0.97
urable       0.94
Activations Density: 0.047%