INDEX
Explanations
phrases discussing gender representation and expectations in media and marketing contexts
Negative Logits
Robbins     -0.20
Rob         -0.18
Rob         -0.17
Bob         -0.15
Bob         -0.15
تز          -0.15
CRC         -0.14
Crab        -0.14
ROC         -0.14
Strategy    -0.14
Positive Logits
Beyond      0.20
epis        0.19
Cage        0.19
Detroit     0.19
Tell        0.19
Life        0.18
Beyond      0.17
episode     0.17
Dont        0.17
Mich        0.17
Activations Density: 0.011%