INDEX
Explanations
phrases related to criticism and rebuke
Negative Logits
gins       -0.74
nex        -0.71
agate      -0.69
pleted     -0.69
iza        -0.67
expires    -0.67
enza       -0.65
lude       -0.65
strings    -0.63
ain        -0.63
Positive Logits
environmentalists    1.20
critics              1.18
commentators         1.04
conservatives        1.03
pundits              1.03
libertarians         1.01
economists           1.00
detractors           0.98
feminists            0.96
commenters           0.96
Activations Density: 0.181%