    Explanations

    expressions of emotional support or concern

    Negative Logits
    ãĥŃãĥ³  -0.17
     dob  -0.15
    erus  -0.14
    æķ  -0.14
    vious  -0.14
    umps  -0.14
     vÃŃ  -0.14
    åĬ  -0.14
     kalk  -0.13
    AAAAAAAA  -0.13
    Positive Logits
    uard  0.15
    anth  0.15
    une  0.14
    arf  0.13
    /backend  0.13
     chatte  0.13
    HEST  0.13
    OfWeek  0.13
    pivot  0.13
    нив  0.13
    Activation Density 0.635%

    No Known Activations