INDEX
Explanations
references to women in various contexts
Negative Logits
ustimmung      -0.70
ScopeManager   -0.67
Boulogne       -0.65
Mille          -0.65
Naidu          -0.64
enumi          -0.63
RIAA           -0.62
emailAlready   -0.62
Combien        -0.61
charlotte      -0.60
POSITIVE LOGITS
s              0.78
womens         0.70
mens           0.68
omens          0.67
Mens           0.65
childrens      0.62
Womens         0.60
iseks          0.59
womens         0.59
AddTagHelper   0.55
Activations Density 0.113%