Explanations
data or statistics on various demographic groups
statements about gender differences and societal roles
Negative Logits
Token            Logit
Launch           -0.69
Operation        -0.64
ARA              -0.63
Skies            -0.63
Bezos            -0.63
Torrent          -0.63
launch           -0.62
DragonMagazine   -0.62
Explos           -0.59
rium             -0.57
Positive Logits
Token            Logit
biologically     1.03
socially         0.96
caregivers       0.95
culturally       0.91
careg            0.90
sexually         0.83
psychologically  0.82
poorer           0.81
wealthier        0.81
unmarried        0.81
Activations Density 1.281%