INDEX
Explanations
mentions of political positions and titles
references to "shadow" cabinet members or positions
Negative Logits
Token        Logit
urses        -0.92
ickr         -0.82
awaru        -0.80
olitan       -0.79
ktop         -0.75
orsi         -0.74
renheit      -0.74
anchester    -0.73
erto         -0.71
keye         -0.70
Positive Logits
Token        Logit
moon         0.93
shadow       0.84
Shadow       0.83
shadow       0.82
runners      0.81
boxing       0.79
loo          0.78
dust         0.76
runner       0.75
busters      0.75
Activations Density: 0.013%