    Explanations

    expressions of urgency and importance in advice or life decisions

    Negative Logits
    Token           Logit
    "yll"           -0.17
    "edar"          -0.16
    "elow"          -0.15
    "á»ĩ"           -0.14
    "iazza"         -0.14
    "ube"           -0.13
    " Deniz"        -0.13
    "ITCH"          -0.13
    "zel"           -0.13
    "HashCode"      -0.13
    Positive Logits
    Token           Logit
    " vip"          0.15
    "vip"           0.15
    "nz"            0.14
    "stroy"         0.14
    "pur"           0.14
    ".createFrom"   0.14
    "plevel"        0.14
    " people"       0.14
    "NN"            0.13
    " Stam"         0.13
    Activation Density: 0.001%

    No Known Activations