INDEX
    Explanations

    expressions of concern and apathy towards others and their feelings

    Negative Logits
    anner  -0.19
    کاری  -0.15
    ersen  -0.15
    arest  -0.14
    ova  -0.14
     ServiceProvider  -0.14
    itude  -0.14
    ader  -0.14
    IDDLE  -0.14
    reu  -0.14
    Positive Logits
    mue  0.16
    ós  0.16
    endir  0.15
     whether  0.14
    .chk  0.14
    fur  0.14
    agus  0.13
    indir  0.13
    ingly  0.13
    죽  0.13
    Activation Density: 0.040%

    No Known Activations