INDEX
    Explanations

    relationships and conversations involving obligation and emotional connections

    Negative Logits
    Token          Logit
    " her"         -1.36
    "Her"          -1.12
    "her"          -1.01
    " Her"         -0.98
    " HER"         -0.92
    " suaminya"    -0.76
    " them"        -0.73
    "them"         -0.73
    "HER"          -0.72
    " ihn"         -0.71
    Positive Logits
    Token          Logit
    " she"         2.73
    "she"          1.73
    "She"          1.55
    " она"         1.42
    " She"         1.33
    " SHE"         1.30
    " вона"        1.19
    "SHE"          1.13
    " he"          0.99
    " fhe"         0.98
    Activation Density: 0.199%

    No Known Activations