    Explanations

    expressions of willingness to assist or provide support

    Negative Logits
    Token         Logit
    "اليا"        -0.17
    "iedo"        -0.16
    "iye"         -0.15
    "usta"        -0.15
    "IRM"         -0.15
    "lect"        -0.15
    "gunakan"     -0.14
    "pest"        -0.14
    "allah"       -0.14
    "ере"         -0.14
    Positive Logits
    Token           Logit
    " happy"        0.55
    " Happy"        0.49
    "happy"         0.48
    "Happy"         0.46
    " happiness"    0.43
    " HAPP"         0.43
    " Happiness"    0.35
    " happier"      0.32
    " happ"         0.31
    " happiest"     0.30
    Activation Density: 0.077%

    No Known Activations