INDEX
    Explanations

    first-person pronouns and expressions of uncertainty or speculation

    Negative Logits
    ルフ      -0.14
    adx       -0.14
    acey      -0.14
    @show     -0.14
    524       -0.14
     起       -0.13
    hood      -0.13
    ies       -0.13
    åŀ        -0.13
    IVED      -0.13
    Positive Logits
     don      0.60
    don       0.50
     Don      0.44
     doesn    0.44
     dont     0.43
    Don       0.43
     DON      0.42
    不知道    0.39
     Dun      0.36
     dun      0.36
    Activation Density: 0.045%

    No Known Activations