INDEX
    Explanations

    references to the self or self-related concepts

    Negative Logits
    AppCompatTheme  -0.68
    openConnection  -0.67
    UrlResolution   -0.66
    corrhi          -0.65
     hunne          -0.64
     Wha            -0.63
    plaatst         -0.61
     automatiques   -0.61
     vectorielles   -0.61
    wüns            -0.60
    Positive Logits
     itself      1.12
     Itself      1.06
    itself       1.02
     Roskov      0.85
     herself     0.84
    本身         0.82
     himself     0.80
     sendiri     0.76
     Himself     0.73
     themselves  0.73
    Act Density 0.060%

    No Known Activations