INDEX
    Explanations

    references to confidence and self-esteem

    Negative Logits
    " Ber"      -0.71
    "gdx"       -0.69
    "к"         -0.64
    "м"         -0.63
    "asley"     -0.62
    "sel"       -0.62
    " berk"     -0.61
    " tark"     -0.60
    " ber"      -0.60
    " Brink"    -0.60
    Positive Logits
    " confidence"     1.74
    "confidence"      1.65
    " Confidence"     1.64
    " confident"      1.56
    "Confidence"      1.47
    "confident"       1.46
    " confidently"    1.35
    " myſelf"         1.28
    " confiance"      1.22
    " itſelf"         1.21
    Act Density 0.056%

    No Known Activations