INDEX
Explanations
words related to politics and government
references to political groups and their associated figures
Negative Logits
Beaut       -0.68
mechanic    -0.64
conqu       -0.62
beaut       -0.60
cleaner     -0.58
egu         -0.57
Mistress    -0.57
utters      -0.57
Mama        -0.56
mosa        -0.55
Positive Logits
alike        1.26
ervatives    0.93
regarding    0.92
condemning   0.90
concerning   0.87
who          0.87
concerned    0.86
alarmed      0.84
who          0.84
whom         0.84
Activations Density 0.385%
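
For readers unfamiliar with the density figure, the sketch below shows one plausible way such a number could be computed. It assumes that density means the fraction of sampled tokens on which the feature's activation is above zero; the page does not state its definition, so the function name, the threshold parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" is the share of tokens on which the feature
    fires at all. The page itself does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 2000 tokens, 8 of which activate the feature.
sample = [0.0] * 1992 + [0.4, 1.2, 0.7, 2.1, 0.3, 0.9, 1.5, 0.6]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.385% would mean the feature fires on roughly 4 of every 1,000 tokens sampled.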