Explanations
words related to disagreement or opposition
references to dissent and opposition to authority
Negative Logits
Token      Logit
ammy       -0.82
onz        -0.70
Grav       -0.65
onut       -0.65
ohyd       -0.64
ategory    -0.64
PATH       -0.63
ologne     -0.63
à¤         -0.62
ursed      -0.62
Positive Logits
Token          Logit
dissent        1.19
dissenting     0.98
ers            0.85
iates          0.84
dissidents     0.81
ible           0.78
bryce          0.77
guiActiveUn    0.76
rained         0.75
ership         0.74
Activations Density 0.009%