Explanations
references to a specific news outlet named "Express"
references to the term "Express"
Negative Logits
Token        Logit
ellen        -0.78
aeper        -0.77
Ambro        -0.71
kees         -0.69
STEM         -0.68
abama        -0.67
quartered    -0.66
Icar         -0.64
tackle       -0.63
SOURCE       -0.62
Positive Logits
Token        Logit
Route        1.12
ions         1.09
ivity        0.98
ional        0.89
Express      0.88
VPN          0.85
iveness      0.84
ivities      0.79
imity        0.78
igent        0.77
Activations Density: 0.025%