Explanations
expressions or statements of disapproval or condemnation
occurrences of the word "denounce" and its variations
Negative Logits
olen      -0.87
omaly     -0.83
ramid     -0.83
ammy      -0.78
cially    -0.78
aker      -0.76
perm      -0.73
icum      -0.72
OVA       -0.72
thora     -0.71
Positive Logits
denouncing    0.98
denounce      0.89
denounced     0.81
loudly        0.80
disav         0.74
aloud         0.73
urous         0.72
racism        0.71
condemning    0.69
homophobia    0.68
Activations Density: 0.013%