INDEX
    Explanations

    occurrences of personal pronouns

    Negative Logits
    UGE         -0.16
    êt          -0.16
    IRM         -0.14
    uiten       -0.14
    eniable     -0.14
     ÄĮer       -0.14
    ANGER       -0.14
     firm       -0.14
    firm        -0.14
    coma        -0.14
    Positive Logits
     demand     0.23
     hereby     0.20
     mean       0.19
     haz        0.19
     bet        0.18
     better     0.18
     promise    0.17
     Demand     0.17
     demands    0.17
     kid        0.16
    Act Density 0.329%

    No Known Activations