Explanations
phrases related to the Koch brothers
mentions of the Koch brothers
Negative Logits
Token     Logit
INST      -0.70
fa        -0.68
prep      -0.66
REC       -0.66
cs        -0.64
syn       -0.62
IMAGES    -0.62
Halo      -0.61
enc       -0.61
Mixed     -0.59
Positive Logits
Token      Logit
Koch       4.22
ALEC       1.39
Cato       1.38
Jindal     1.28
Koz        1.20
Rove       1.12
kefeller   1.09
Friedrich  1.09
Kuh        1.08
NRA        1.08
Activations Density: 0.025%