    Negative Logits
    Token             Logit
    "(/["             -0.07
    " grain"          -0.06
    " кисл"           -0.06
    "εί"              -0.06
    "ующ"             -0.06
    " kapsamında"     -0.06
    (blank in source) -0.06
    " genel"          -0.06
    " dikke"          -0.06
    " yıl"            -0.06
    Positive Logits
    Token             Logit
    "_comb"           0.07
    "eligible"        0.07
    "Thank"           0.06
    " Thank"          0.06
    " Zaman"          0.06
    " horror"         0.06
    " sealed"         0.06
    " تمامی"          0.06
    " angry"          0.06
    " ^="             0.06
    Activation Density: 0.036%

    No Known Activations