INDEX
    Explanations

    references to female individuals

    Negative Logits
    Token        Logit
    " minimum"   -0.61
    " aix"       -0.61
    "ικα"        -0.60
    "dui"        -0.60
    "ĩ"          -0.59
    "Udo"        -0.58
    " rati"      -0.58
    " Landis"    -0.57
    "icei"       -0.57
    " dai"       -0.57
    Positive Logits
    Token        Logit
    " she"       1.31
    "She"        1.31
    " She"       1.19
    "she"        1.17
    " SHE"       1.05
    "SHE"        1.04
    " herself"   1.00
    " shes"      0.95
    "shein"      0.94
    " he"        0.94
    Act Density 0.104%

    No Known Activations