INDEX
Explanations
references to historical practices and concepts involving gender roles and societal structures
Negative Logits
Token          Logit
purpoſe        -0.69
himſelf        -0.63
myſelf         -0.61
Monfieur       -0.61
quæ            -0.61
Efq            -0.60
Jefus          -0.58
juſt           -0.58
transfieras    -0.55
houſe          -0.55
Positive Logits
Token          Logit
曖昧さ回避      0.53
Biblia         0.49
ertale         0.48
Citations      0.48
raisers        0.48
SGS            0.48
CardBody       0.47
mum            0.46
yaar           0.46
']).           0.46
Activations Density: 0.514%