    Negative Logits
     sincere    -0.07
     testcase   -0.07
                -0.07
    usp         -0.06
     wa         -0.06
                -0.06
     Whisper    -0.06
                -0.06
    能源        -0.06
                -0.06
    Positive Logits
     astronom   0.06
    ологія      0.06
     سرم        0.06
    (button     0.06
    Guess       0.06
    _Char       0.06
    {i          0.06
     лог        0.06
    #if         0.05
    erer        0.05
    Act Density 0.003%

    No Known Activations