INDEX
Explanations
references to women and their achievements, particularly in leadership and historically significant roles
NEGATIVE LOGITS
iker        -0.15
Gut         -0.15
Separated   -0.15
acher       -0.14
iphy        -0.14
istar       -0.14
andon       -0.14
空          -0.14
urgeon      -0.14
-style      -0.13
POSITIVE LOGITS
mutable     0.15
ê¶Į         0.15
Ness        0.15
ERRU        0.14
isco        0.14
enthusi     0.14
iat         0.14
Äįet        0.14
à¹Īาà¸ĩ      0.14
OKEN        0.14
Activations Density 0.083%