INDEX
    Explanations

    expressions of opinion or feedback

    Negative Logits
    Token         Logit
    "tera"        -0.18
    "icie"        -0.15
    "esan"        -0.15
    "tero"        -0.15
    "ongan"       -0.14
    "vero"        -0.14
    "abant"       -0.14
    " Incontri"   -0.14
    " Pok"        -0.14
    " Ale"        -0.14
    Positive Logits
    Token         Logit
    " like"       0.47
    "like"        0.33
    " Like"       0.32
    "Like"        0.31
    "_like"       0.31
    " likes"      0.30
    " LIKE"       0.29
    ".like"       0.27
    " como"       0.26
    " như"        0.26
    Activation Density: 0.037%

    No Known Activations