Explanations
controversial topics or issues
instances of the word "controversial" and its variations
Negative Logits
vation      -0.82
á           -0.81
ynthesis    -0.79
elsen       -0.76
©¶æ         -0.75
urance      -0.72
abetic      -0.72
united      -0.70
ovember     -0.69
ür          -0.68
Positive Logits
aspects         0.83
topics          0.78
controversial   0.77
ity             0.76
fringe          0.75
iating          0.72
culprit         0.72
opinions        0.71
assumptions     0.70
viewpoints      0.70
Activations Density: 0.038%