    Explanations

    punctuation marks and exclamation points

    Negative Logits
    _TA                 -0.14
     davran             -0.13
    руш                 -0.12
     minul              -0.12
    output              -0.12
    вит                 -0.12
    …)↵↵                -0.12
    module              -0.12
    SupportedContent    -0.12
    data                -0.12
    Positive Logits
    heck                0.27
    "                   0.23
    yeah                0.22
    hey                 0.20
    ya                  0.20
    oh                  0.20
    ugh                 0.20
    okay                0.20
    (                   0.20
    UGH                 0.19
    Activation Density: 0.182%

    No Known Activations