    Negative Logits
     intensified    -0.08
     fels           -0.08
     annoyance      -0.07
     turbul         -0.07
    .Field          -0.07
     Tobias         -0.07
    ło              -0.07
    صف              -0.07
     Printed        -0.07
     proble         -0.07
    Positive Logits
     virar          0.08
    (employee       0.08
     образ          0.08
     turning        0.08
     الموظ          0.08
    (turn           0.08
     Turning        0.08
     polícia        0.08
     resultar       0.07
     বিভিন্ন          0.07
    Activation Density: 0.001%

    No Known Activations