    Negative Logits
    Token            Logit
    " Labels"        -0.07
    " sticky"        -0.07
    " ankle"         -0.06
    " Nico"          -0.06
    " bunker"        -0.06
    "aterno"         -0.06
    "isOk"           -0.06
    " decryption"    -0.06
    "who"            -0.06
    " curse"         -0.06
    Positive Logits
    Token            Logit
    "oundation"      0.07
    "017"            0.07
    "miss"           0.06
    " Expected"      0.06
    " nl"            0.06
    "TextBox"        0.06
    (blank)          0.06
    "��"             0.06
    "-comments"      0.06
    " Croatian"      0.06
    Activation Density: 0.012%

    No Known Activations