Explanations
keywords related to criticism and controversy
mentions of advocacy, criticism, and political reactions regarding social issues
Negative Logits
shalt            -0.67
reproduction     -0.63
giveaway         -0.58
rect             -0.57
operates         -0.57
misdem           -0.55
grace            -0.55
ãĥĺ              -0.54
survived         -0.54
supplementary    -0.54
Positive Logits
alike            1.21
who              1.03
who              0.90
ervatives        0.88
worried          0.88
concerned        0.82
appalled         0.82
wary             0.79
alarmed          0.78
skeptical        0.78
Activations Density: 0.322%