INDEX
Explanations
mentions of political figures and governmental actions
Negative Logits
finder        -0.80
Craigslist    -0.80
uploads       -0.78
igslist       -0.74
miscarriage   -0.71
archive       -0.69
Reviewer      -0.69
icide         -0.69
usercontent   -0.69
quality       -0.67
Positive Logits
remarks       1.23
reiterated    1.16
emphas        1.15
praised       1.10
spoke         1.08
stressed      1.06
reiter        1.05
reiterate     1.05
Speaking      1.01
stressing     1.00
Activations Density 0.429%