INDEX
Explanations
words related to the United States or its citizens
references to Americans or American demographics
Negative Logits
cer            -0.67
camera         -0.67
Initialized    -0.67
operation      -0.65
bol            -0.65
Za             -0.64
Sys            -0.64
type           -0.64
Drag           -0.62
rous           -0.62
Positive Logits
hip            0.90
'              0.88
living         0.86
who            0.81
residing       0.81
polled         0.79
ervative       0.79
aged           0.79
enrolled       0.78
ervatives      0.77
Activations Density: 0.064%