INDEX
Explanations
terms related to demagoguery and agitation
terms related to manipulation and potential deceit in media coverage
Negative Logits
staking      -0.83
backer       -0.78
tracing      -0.74
ball         -0.69
Toll         -0.67
rem          -0.67
pora         -0.66
cos          -0.66
modelling    -0.64
traced       -0.64
Positive Logits
glers        1.38
agog         1.16
agogue       1.10
ically       1.09
atory        1.05
acles        1.00
uments       0.98
ues          0.96
ibles        0.91
acle         0.90
Activations Density: 0.016%