INDEX
Explanations
phrases related to speaking out or expressing opinions
references to voices and expressions from various groups or individuals, particularly in discussions of representation and inequality
Negative Logits
amily      -0.70
onut       -0.69
hib        -0.69
Kear       -0.69
etheless   -0.67
sis        -0.67
addy       -0.65
keyes      -0.64
slaught    -0.62
athing     -0.62
Positive Logits
voices     1.03
recorder   1.02
voice      1.00
voice      0.98
louder     0.85
Voice      0.79
melody     0.76
holder     0.76
heard      0.74
chorus     0.73
Activations Density 0.016%