Explanations
proper nouns and specific terms, such as political parties, organizations, and specific individuals
phrases related to representation and group identity in a political context
Negative Logits
Token     Logit
bombs     -0.68
Rusty     -0.63
matt      -0.62
dazz      -0.61
Nikol     -0.61
iously    -0.60
enson     -0.59
ffe       -0.58
lest      -0.57
kered     -0.56
Positive Logits
Token             Logit
ICAN              0.76
soDeliveryDate    0.75
opian             0.69
essional          0.67
oin               0.66
hetti             0.66
hematic           0.66
oun               0.65
actly             0.65
POS               0.65
Activations Density: 0.130%