INDEX
Explanations
references to sports teams or institutions
references to specific sports teams and universities
Negative Logits
terday        -1.02
DonaldTrump   -0.98
imore         -0.95
schild        -0.90
Seym          -0.80
sembly        -0.79
apons         -0.78
orer          -0.76
colm          -0.75
etsy          -0.74
Positive Logits
neys          1.00
Cou           0.96
gars          0.80
ney           0.78
NEY           0.74
Arabia        0.72
ndra          0.72
Ducks         0.71
resy          0.70
DVD           0.70
Activations Density: 0.045%