Explanations
controversial or negative events involving individuals
references to incidents or actions that involve legal or criminal matters
Negative Logits
tremend   -0.75
eleph     -0.66
newcom    -0.66
sucker    -0.65
lest      -0.61
propri    -0.60
favors    -0.58
goalt     -0.58
newsp     -0.58
endeavor  -0.57
Positive Logits
Updated           1.02
Posted            1.00
hello             0.91
Britain           0.89
³³³               0.84
Gab               0.76
³³³³³³³³          0.76
³³³³³³³³³³³³³³³³  0.75
TW                0.75
Bus               0.72
Activations Density: 0.071%