Explanations
words related to authority and dissent in political contexts
Negative Logits
anwhile    -0.69
utical     -0.69
lapse      -0.66
doms       -0.65
afety      -0.64
parcels    -0.64
ripe       -0.62
iguous     -0.62
hern       -0.61
sylv       -0.61
Positive Logits
issance    1.02
wagen      0.71
Wagner     0.69
izzle      0.69
spective   0.68
invoke     0.68
Cathedral  0.67
ctive      0.67
Rollins    0.66
enment     0.66
Activations Density: 0.063%