INDEX
    Explanations

    expressions of liking or positive sentiment

    Negative Logits
     Audiodateien    -0.73
    abestanden       -0.68
    脚注の使い方      -0.66
    المناصب          -0.63
    PreExecute       -0.60
     riguarda        -0.59
    <bos>            -0.57
    \{\\             -0.56
     spécialement    -0.55
    Cyfarwyddwr      -0.55
    Positive Logits
     liked           0.69
     loved           0.61
     aimez           0.61
     Loved           0.60
     liking          0.59
    loved            0.59
    achusetts        0.57
    artamento        0.57
     favorite        0.56
     favored         0.54
    Activation Density: 0.007%

    No Known Activations