    Explanations

    pronouns and possessive determiners referring to people

    pronouns and their associated references to individuals

    Negative Logits
    Token   Logit
    haus    -0.85
    gap     -0.79
    GAN     -0.75
    Ïĥ      -0.74
    tree    -0.74
    TABLE   -0.73
    ij士      -0.73
    river   -0.72
    ÏĦ      -0.71
    Slot    -0.70
    Positive Logits
    Token             Logit
     own              1.42
     willingness      1.22
     inability        1.17
     penchant         1.11
     entire           1.03
     favourite        1.03
     consequ          1.00
     unwillingness    0.99
     propensity       0.98
     susceptibility   0.94
    Activation Density: 0.096%

    No Known Activations