INDEX
    Explanations

    reflexive pronouns and phrases indicating self-reference

    Negative Logits
    Token             Logit
    " hunne"          -0.59
    "rijke"           -0.58
    " itinéraires"    -0.56
    " Bani"           -0.56
    "dars"            -0.55
    " tiros"          -0.55
    " Racine"         -0.53
    "Pik"             -0.53
    "ENOS"            -0.52
    "ary"             -0.52
    Positive Logits
    Token             Logit
    "itself"          1.35
    " itself"         1.32
    " Itself"         1.24
    " Roskov"         1.03
    " himself"        1.00
    " sendiri"        0.96
    " Himself"        0.95
    "himself"         0.90
    "themselves"      0.89
    " herself"        0.86
    Activation Density: 0.105%

    No Known Activations