INDEX
    Explanations
    NEGATIVE LOGITS
    Token         Logit
    "emplo"       -0.07
    " plu"        -0.07
    " Paste"      -0.07
    " produtos"   -0.06
    " cheaper"    -0.06
    " poo"        -0.06
    "\e"          -0.06
    " waste"      -0.06
    " зменш"      -0.06
    " stud"       -0.06
    POSITIVE LOGITS
    Token         Logit
    " any"        0.16
    " Any"        0.12
    " ANY"        0.12
    "Any"         0.12
    "any"         0.09
    "Every"       0.09
    "-any"        0.09
    "ANY"         0.08
    " Every"      0.08
    ".Any"        0.08
    Activation Density: 0.109%

    No Known Activations