INDEX 598
    Negative Logits
    -0.06
     süt
    -0.06
     paint
    -0.06
     advisors
    -0.06
    ixture
    -0.06
     initial
    -0.06
    ITO
    -0.06
     escol
    -0.06
     killings
    -0.06
     OW
    -0.06
    Positive Logits
    0.07
     крас
    0.06
    Southern
    0.06
     Override
    0.06
    0.06
     bothering
    0.06
    0.06
     Ф
    0.06
    0.06
     postId
    0.06
    Act Density 0.007%

    No Known Activations