    Explanations

    work, improvement, planning

    Negative Logits
    Token       Logit
    " herd"     -0.08
    " dará"     -0.08
    " ş"        -0.08
    " dira"     -0.08
    " da"       -0.08
    " weak"     -0.07
    " şek"      -0.07
    " kuwa"     -0.07
    " Ш"        -0.07
    " isi"      -0.07

    Positive Logits
    Token       Logit
    " Musical"  0.09
    " gotten"   0.08
    " musical"  0.08
    " Doesn"    0.08
    "worked"    0.08
    " Mus"      0.07
    " достой"   0.07
    "ős"        0.07
    "ochond"    0.07
    ",col"      0.07

    Activation Density: 1.324%

    No Known Activations