INDEX
    Explanations

    instances of emphasis or attention in a document

    NEGATIVE LOGITS
    â
    -0.65
    -0.64
    Ã
    -0.57
     â
    -0.57
    Â
    -0.53
    -0.52
     Â
    -0.52
    -0.47
    ď
    -0.47
     ``
    -0.45
    POSITIVE LOGITS
    💪
    0.76
     🥳
    0.76
    👀
    0.75
    🤔
    0.75
     👀
    0.74
     estekak
    0.74
    👉
    0.73
    🙏
    0.72
    🥳
    0.72
    🎉
    0.72
    Act Density 0.267%

    No Known Activations