Explanations
actions or statements that are disapproved or criticized by authorities or organizations
instances of the word "condemned"
Negative Logits
Token          Logit
thora          -0.93
mology         -0.80
athering       -0.77
opic           -0.76
ovember        -0.76
omal           -0.74
ewater         -0.72
vati           -0.71
bernatorial    -0.70
cially         -0.70
Positive Logits
Token          Logit
condemned      1.09
condemning     0.86
condem         0.82
condemns       0.80
utsche         0.76
Osc            0.73
condemn        0.71
denounced      0.71
Survivors      0.68
Skinner        0.67
Activations Density: 0.006%