INDEX
Explanations
political-related words, including names of politicians and specific political events
Negative Logits
angular    -0.84
undai      -0.78
eon        -0.76
obyl       -0.76
rices      -0.72
amber      -0.71
razen      -0.69
opal       -0.67
ronic      -0.67
allery     -0.66
Positive Logits
Rapids     0.91
Brewers    0.84
waukee     0.82
Madison    0.82
kers       0.81
Wisconsin  0.80
WI         0.79
Bucks      0.76
DL         0.74
stown      0.72
Activations Density: 0.063%