INDEX
    Explanations

    personal pronouns referring to a specific individual

    Negative Logits
    " olx"         -1.24
    " levis"       -1.22
    " budapest"    -1.19
    " fatis"       -1.18
    " magis"       -1.16
    " umo"         -1.16
    " Juf"         -1.14
    " tanong"      -1.14
    " wien"        -1.13
    " lele"        -1.12
    Positive Logits
    "He"           0.87
    " He"          0.86
    " he"          0.85
    "he"           0.77
    " didn"        0.75
    " himself"     0.74
    " did"         0.72
    "She"          0.71
    " hasn"        0.70
    " She"         0.69
    Activation Density: 0.282%

    No Known Activations