INDEX
Explanations
references to female individuals
Negative Logits
Token     Logit
minimum   -0.61
aix       -0.61
ικα       -0.60
dui       -0.60
ĩ         -0.59
Udo       -0.58
rati      -0.58
Landis    -0.57
icei      -0.57
dai       -0.57
Positive Logits
Token     Logit
she       1.31
She       1.31
She       1.19
she       1.17
SHE       1.05
SHE       1.04
herself   1.00
shes      0.95
shein     0.94
he        0.94
Activations Density: 0.104%