INDEX
Explanations
words related to conspiracy theories and controversial topics
Negative Logits
constellation   -1.01
Hawaiian        -0.96
acquisitions    -0.94
LCS             -0.93
Iceland         -0.93
llan            -0.90
rings           -0.89
ADS             -0.89
Nigerian        -0.87
Roma            -0.87
Positive Logits
iest            1.55
ier             1.53
iers            1.49
estate          1.45
engers          1.42
uers            1.40
iness           1.39
erness          1.38
arella          1.36
orship          1.35
Activations Density: 1.356%