INDEX
    Explanations

    phrases indicating certainty or emphasis

    Negative Logits
     🤣🤣
    -0.71
     😍😍
    -0.71
    🥲
    -0.70
     unil
    -0.69
     🥲
    -0.69
     calvin
    -0.68
     Simult
    -0.68
    🙃
    -0.68
     😭😭
    -0.67
    ☺☺
    -0.66
    Positive Logits
     definitely
    0.75
    Definitely
    0.74
    <bos>
    0.71
    definitely
    0.69
     Definitely
    0.68
     definite
    0.62
    expandindo
    0.61
     definitiv
    0.52
    definite
    0.50
     definately
    0.49
    Act Density 0.097%

    No Known Activations