    Explanations

    expressions of preference or affection

    Comes before opinions or preferences

    Negative Logits
     endwhile        -0.49
    بوابة            -0.48
    <bos>            -0.47
     breathtaking    -0.45
     devastating     -0.45
     breakthroughs   -0.43
     سكانية          -0.42
     Burnham         -0.42
    ^                -0.41
    lastonbury       -0.41

    Positive Logits
    Dislikes          0.67
     InputDecoration  0.66
     liked            0.64
    Liked             0.61
     Liked            0.60
    dislike           0.59
    liked             0.59
     Likes            0.56
    Likes             0.54
     gusta            0.54
    Activation Density: 0.064%

    No Known Activations