    Negative Logits
    ".offer"                             -0.07
    "Sunday"                             -0.06
    ".contents"                          -0.06
    " Sunday"                            -0.06
    " приб"                              -0.06
    " shocks"                            -0.06
    ".↵"                                 -0.06
    "šť"                                 -0.06
    "forcement"                          -0.06
    " ********************************"  -0.06
    Positive Logits
    "writer"      0.07
    " kenn"       0.07
    ",ID"         0.06
    "'util"       0.06
    "领导"        0.06
    "άνι"         0.06
    "department"  0.06
    "rewrite"     0.06
    " hydro"      0.06
    " italiani"   0.06
    Activation Density: 0.006%

    No Known Activations