INDEX
Explanations
references to condemnation and criticism of violent or discriminatory actions
Negative Logits
arguments      -0.19
Arguments      -0.19
arguing        -0.17
controversial  -0.16
reve           -0.15
Arguments      -0.15
contentious    -0.15
Argument       -0.15
complaining    -0.15
arguments      -0.14
Positive Logits
condemn        0.37
condemnation   0.35
-cond          0.35
condemned      0.33
condem         0.33
cond           0.32
condemning     0.30
cond           0.30
solidarity     0.28
Cond           0.28
Activations Density 0.118%