INDEX
Explanations
mentions of serious accusations or claims
references to allegations in various contexts
Negative Logits
emaker    -0.69
wisely    -0.67
iol       -0.67
Plasma    -0.65
ete       -0.65
iet       -0.63
saved     -0.63
ðŁĻĤ      -0.63
tuned     -0.63
arius     -0.63
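
The token ðŁĻĤ in the list above looks like encoding debris, but it is consistent with how a GPT-2 style byte-level BPE vocabulary prints raw bytes. The sketch below assumes such a tokenizer (the page does not name one) and rebuilds the standard byte-to-character table to map the token back to text. The names bytes_to_unicode and decode_token are illustrative.

```python
def bytes_to_unicode():
    """Byte -> printable character table, assuming the GPT-2 byte-level BPE scheme."""
    # Bytes that are already printable keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAD))
        + list(range(0xAE, 0x100))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: displayed character -> original byte value.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token):
    """Turn a vocabulary string back into the UTF-8 text it stands for."""
    raw = bytes(BYTE_DECODER[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


decoded = decode_token("ðŁĻĤ")
print(decoded, hex(ord(decoded)))
```

Under that assumption the token is the four bytes F0 9F 99 82, which is the UTF-8 encoding of U+1F642 (a slightly smiling face emoji). Plain ASCII tokens such as emaker pass through unchanged.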
Positive Logits
allegations    3.43
accusations    2.85
allegation     2.64
accusation     2.10
suspicions     1.87
revelations    1.84
assertions     1.83
claims         1.79
rumours        1.72
complaints     1.70
Activations Density 0.020%
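
The page does not define activation density. A common reading is the fraction of sampled tokens on which the feature has a nonzero activation, and the sketch below works under that assumption. The function name, the threshold parameter, and the sample data are hypothetical and chosen only to reproduce a figure of the same size.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds the threshold (assumed definition)."""
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 2 of 10,000 tokens.
acts = [0.0] * 10_000
acts[17] = 1.3
acts[4242] = 0.4

density = activation_density(acts)
formatted = f"{density:.3%}"
print(formatted)
```

A density this low means the feature is sparse: it fires on roughly one token in five thousand, which fits a feature tied to a narrow vocabulary such as allegations and accusations.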