Explanations
references to male-dominant institutions and their activities
Negative Logits
ennon        -0.16
ær           -0.15
cr           -0.15
omic         -0.15
Buccane      -0.15
anners       -0.14
FW           -0.14
Dr           -0.14
Challenger   -0.14
routine      -0.14
Positive Logits
Princeton     0.25
Yale          0.24
Ezra          0.23
Ivy           0.23
Cornell       0.21
Residential   0.19
inceton       0.19
residential   0.19
concentr      0.18
Dart          0.17
Activations Density 0.030%