    Negative Logits
    Token          Logit
    " joy"         -0.07
    " rahatsız"    -0.07
    "763"          -0.07
    "(fontSize"    -0.06
    "ź"            -0.06
    "thin"         -0.06
    "ImGui"        -0.06
    "_utc"         -0.06
    " 몸을"         -0.06
    " Lebens"      -0.06
    Positive Logits
    Token          Logit
    " cane"        0.12
    " cess"        0.07
    "consts"       0.06
    "636"          0.06
    " cass"        0.06
    " (){↵"        0.06
    " Candy"       0.06
    " Κά"          0.06
    " fiss"        0.06
    " енерг"       0.06
    Act Density 0.001%

    No Known Activations