INDEX
    Explanations
    Negative Logits
    -0.06
    تو
    -0.06
     prop
    -0.06
     концеп
    -0.06
     reput
    -0.06
    -0.06
     resist
    -0.06
    _"
    -0.06
     adjective
    -0.06
    <Integer
    -0.06
    Positive Logits
    0.07
     impover
    0.07
    pike
    0.06
    0.06
    EXT
    0.06
     розви
    0.06
    ıyor
    0.06
     khả
    0.06
    =\
    0.06
     şu
    0.06
    Activation Density: 0.007%

    No Known Activations