INDEX
Explanations
words related to disagreement or opposition
expressions of opposition or disapproval
Negative Logits
Token        Logit
negie        -0.88
ammy         -0.84
oufl         -0.77
=~=~         -0.69
Redditor     -0.69
seed         -0.69
ardy         -0.67
ramid        -0.65
ashington    -0.64
agna         -0.64
Positive Logits
Token        Logit
vehemently   0.87
voc          0.82
onent        0.81
ICE          0.76
thereto      0.71
pport        0.71
stren        0.70
vigorously   0.69
prejudice    0.68
opposing     0.68
Activations Density: 0.028%