    Negative Logits
    Token          Logit
    " пош"         -0.07
    " 준비"        -0.07
    " вд"          -0.06
    " içer"        -0.06
    " kar"         -0.06
    "iêng"         -0.06
    (not shown)    -0.06
    " vod"         -0.06
    " conc"        -0.06
    " мин"         -0.06
    Positive Logits
    Token          Logit
    "εις"          0.07
    "Pat"          0.06
    "skin"         0.06
    "iker"         0.06
    "Sa"           0.06
    "_mex"         0.06
    "bes"          0.06
    "leşik"        0.06
    "/write"       0.06
    "numero"       0.06
    Activation Density: 0.002%

    No Known Activations