INDEX
    Explanations

    expressions of preference or enjoyment

    Negative Logits
     Hernandez
    -0.75
     Asher
    -0.71
    èvre
    -0.66
    اهرة
    -0.63
     Fernández
    -0.62
     brazos
    -0.62
    Gem
    -0.62
    <h6>
    -0.62
    PathVariable
    -0.61
    -0.60
    Positive Logits
     Likes
    1.13
     liked
    1.12
     Liked
    1.05
     liking
    1.04
     likes
    1.03
    Likes
    1.01
     Lik
    0.97
    likes
    0.96
    dislike
    0.95
    liked
    0.90
    Activation Density 0.056%

    No Known Activations