INDEX
    Explanations

    positive conversational emojis

    Negative Logits
    Token           Logit
    📠              0.93
    (not rendered)  0.92
    (not rendered)  0.91
    🕤              0.91
    🕝              0.90
    (not rendered)  0.89
    📙              0.89
    📔              0.88
    🕴              0.88
    🕣              0.87
    Positive Logits
    Token           Logit
    (not rendered)  1.15
    ↵↵              1.07
    !               1.04
    K               1.00
    (not rendered)  1.00
    Why             0.96
    Oh              0.92
     ف              0.91
    k               0.90
    Fighting        0.87
    Act Density 0.384%

    No Known Activations