    Explanations

    expressions of emotional states or feelings

    Negative Logits
    Token       Value
    "ule"       -0.15
    "ula"       -0.14
    "esan"      -0.14
    "pg"        -0.14
    "isu"       -0.14
    ".simple"   -0.14
    " Mary"     -0.14
    " unh"      -0.13
    "aga"       -0.13
    " L"        -0.13
    Positive Logits
    Token       Value
    " like"     0.26
    "像是"      0.19
    "_like"     0.18
    " như"      0.18
    "ingly"     0.18
    "Like"      0.18
    " Like"     0.18
    " LIKE"     0.17
    " seperti"  0.16
    "bern"      0.16
    Act Density 0.023%

    No Known Activations