INDEX
    Explanations

    acknowledging informal or confrontational inputs

    Negative Logits
    Token      Value
    "!         0.52
     편리      0.50
     wonderful 0.48
    '!         0.47
     exciting  0.45
     Exc       0.45
    !।         0.44
    ”!         0.44
    Exc        0.43
    답니다     0.43
    Positive Logits
    Token      Value
     idk       1.02
     tbh       0.96
     lmao      0.94
     shit      0.91
     honestly  0.88
     dude      0.88
     fucked    0.85
     weird     0.83
     shitty    0.82
     Fuck      0.82
    Act Density 0.010%

    No Known Activations