INDEX
Explanations
discussions related to gender representation in media, particularly focusing on female characters and leads
Negative Logits
adder          -0.15
edback         -0.15
RuntimeObject  -0.14
سر             -0.14
едини          -0.14
gere           -0.13
MOTE           -0.13
.ta            -0.13
ænd            -0.12
таж            -0.12
Positive Logits
characters   0.57
character    0.54
hero         0.46
characters   0.45
Characters   0.44
protagonist  0.43
lead         0.43
main         0.43
character    0.42
Character    0.40
Activations Density 0.289%