    Negative Logits
    "ovich"       -0.19
    "igu"         -0.18
    "ortic"       -0.16
    " McCart"     -0.16
    "ioso"        -0.16
    "isci"        -0.15
    "ured"        -0.15
    "icom"        -0.15
    "nga"         -0.14
    " ucfirst"    -0.14
    Positive Logits
    "ENER"          0.15
    "SIZE"          0.14
    "Datas"         0.14
    "adlo"          0.13
    "措"            0.13
    " ke"           0.13
    "istrovství"    0.13
    "吹"            0.13
    " desn"         0.13
    "ernal"         0.13
    Activation Density: 0.009%

    No Known Activations