Explanations
discussions about societal norms and stereotypes related to gender and parenting
Negative Logits
متعلقه  -0.91
nakalista  -0.88
Roskov  -0.87
Bronnen  -0.64
InvalidProtocol  -0.61
estekak  -0.60
DeleteBehavior  -0.60
Begriffsklä  -0.60
TestingModule  -0.57
linkovi  -0.57
Positive Logits
stereotypes  1.20
stereotype  1.10
societal  1.05
stereotyp  1.02
norms  0.96
stereotypical  0.94
perceptions  0.94
society  0.94
perception  0.93
assumptions  0.92
Activations Density 0.443%