INDEX
Explanations
the political party affiliations and states of various politicians
specific political party affiliations and identifiers
Negative Logits
compilation    -0.68
ponies         -0.67
prompt         -0.63
gor            -0.63
stereotypes    -0.60
soundtrack     -0.58
bumper         -0.58
gements        -0.57
header         -0.57
prol           -0.56
Positive Logits
Managing       0.73
director       0.73
Found          0.73
iffe           0.72
join           0.67
ociate         0.66
LLP            0.65
.,             0.65
hematic        0.64
olla           0.63
Activations Density: 0.111%