INDEX
Explanations
references to female characters and their relationships or attributes
Refers to a woman or her actions
female pronouns and related actions
NEGATIVE LOGITS
MigrationBuilder  -0.43
僕が  -0.43
僕  -0.40
UnsafeEnabled  -0.40
buddies  -0.39
sividad  -0.39
})->  -0.39
choly  -0.38
Jonny  -0.38
zwa  -0.38
POSITIVE LOGITS
Obrigada  0.68
herself  0.67
lesbian  0.65
hendes  0.63
حياتها  0.60
Women  0.57
करती  0.57
businesswoman  0.56
🏻♀️  0.56
FormTagHelper  0.55
Activations Density 0.621%