INDEX
Explanations
terms related to political party affiliations, specifically focusing on Republicans and judges
Negative Logits
-0.50
fully        -0.49
re           -0.48
first        -0.48
initial      -0.47
realized     -0.47
subsequent   -0.47
set          -0.47
program      -0.47
sub          -0.45
Positive Logits
Republican    1.22
Republican    1.11
Republicans   1.06
Republicans   1.05
republic      1.03
REPUBLIC      0.94
republicans   0.91
Republic      0.90
republican    0.87
republic      0.86
Activations Density 0.208%
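
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that "activations density" means the fraction of token positions on which the feature's activation is above zero; the page does not state its definition, so treat this as an assumption. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density = (number of active positions) / (total positions).
    """
    values = list(activations)
    if not values:
        return 0.0
    active = sum(1 for a in values if a > threshold)
    return active / len(values)


# Toy data (not from the page): 208 active positions out of 100,000,
# constructed so the result matches the 0.208% figure shown above.
toy_activations = [1.0] * 208 + [0.0] * 99_792
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under this reading, a density of 0.208% means the feature fires on roughly 2 of every 1,000 tokens, which fits a feature tied to a narrow set of tokens.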