INDEX
    Explanations

    feminine singular pronouns

    Negative Logits
    Token         Logit
    " himself"    -2.25
    " his"        -1.87
    " esso"       -1.67
    " jego"       -1.54
    " he"         -1.51
    " celui"      -1.49
    " usarlo"     -1.47
    " его"        -1.44
    " který"      -1.38
    "himself"     -1.38
    Positive Logits
    Token         Logit
    " herself"    3.69
    " she"        3.44
    " celle"      2.89
    " она"        2.83
    " которой"    2.80
    " her"        2.78
    " которая"    2.67
    " laquelle"   2.63
    " quella"     2.55
    "herself"     2.52
    Activation Density: 0.041%

    No Known Activations