INDEX
Explanations
phrases related to criticism or condemned actions
mentions of criticism directed by individuals or groups
Negative Logits
lude           -0.72
redistributed  -0.69
Done           -0.65
process        -0.65
nex            -0.63
agate          -0.63
gotta          -0.62
feat           -0.62
bis            -0.62
done           -0.60
Positive Logits
critics            1.04
commentators       1.04
pundits            1.04
passers            1.00
environmentalists  0.96
onlook             0.93
economists         0.89
STATS              0.89
detractors         0.86
conservatives      0.86
Activations Density 0.210%