    NEGATIVE LOGITS
    Token         Logit
    " Pay"        -0.08
    " payoff"     -0.08
    " pec"        -0.08
    "iona"        -0.08
    "uition"      -0.07
    " gennaio"    -0.07
    " Myn"        -0.07
    "uek"         -0.07
    " Bro"        -0.07
    "ентов"       -0.07
    POSITIVE LOGITS
    Token               Logit
    "σ"                 0.08
    " wounds"           0.08
    (no token shown)    0.07
    " sén"              0.07
    "(tr"               0.07
    " Fiat"             0.07
    "ár"                0.07
    " communes"         0.07
    (no token shown)    0.07
    (no token shown)    0.07
    Activation Density: 0.013%

    No Known Activations