INDEX
    Explanations

    acknowledging conversational turns

    Negative Logits
     Mesmo  0.74
    💔  0.74
     speculated  0.66
     Blogging  0.66
     lmao  0.66
     blogger  0.64
     bloggers  0.63
    bardziej  0.63
    💕  0.63
    💘  0.63
    Positive Logits
     now  0.97
     ahora  0.90
    现在  0.77
     agora  0.76
     şimdi  0.73
    Now  0.71
    それでは  0.70
     first  0.70
     сейчас  0.70
    겠습니다  0.70
    Activation Density: 0.384%

    No Known Activations