INDEX
Explanations
repeated references to a female subject or pronouns associated with female entities
Negative Logits
BrowserModule    -0.53
statechange      -0.49
raiſ             -0.49
Tinto            -0.49
uſed             -0.48
Fiat             -0.47
समीक्षाएं          -0.47
Fiat             -0.46
Numerade         -0.46
itſelf           -0.46
Positive Logits
her      3.36
Her      2.97
HER      2.95
Her      2.91
HER      2.80
her      2.23
hers     1.65
heri     1.57
herit    1.52
Хер      1.48
Activations Density 0.289%
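
The density figure above can be read as the share of sampled tokens on which the feature fires at all. The sketch below illustrates that reading under stated assumptions: the page does not say how the value was computed, so the "activation > 0" threshold, the function name activation_density, and the sample of 100,000 tokens are all hypothetical, chosen only so that the arithmetic reproduces 0.289%.

```python
def activation_density(activations):
    """Return the fraction of tokens with a strictly positive activation.

    Assumption: "density" means the share of tokens on which the feature
    is active; the source page does not define the term.
    """
    if not activations:
        raise ValueError("need at least one token activation")
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, of which 289 activate the feature.
sample = [1.0] * 289 + [0.0] * (100_000 - 289)
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.289%
```

Under this reading, a density of 0.289% means the feature is sparse: it is active on roughly 3 of every 1,000 tokens, consistent with a feature tied to one narrow token family.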