INDEX
Explanations
phrases or words related to strong disapproval or criticism
instances of the word "condemned."
Negative Logits
thora      -0.93
ovember    -0.79
vati       -0.78
weeney     -0.76
opic       -0.73
ramid      -0.71
athering   -0.70
hack       -0.70
OTE        -0.69
nown       -0.69
Positive Logits
condemned    0.92
condemning   0.82
condemn      0.82
condemns     0.79
harshly      0.78
utsche       0.77
condem       0.75
criticised   0.69
critic       0.68
Survivors    0.68
Activations Density: 0.009%