INDEX
    Explanations
    Negative Logits
    Token             Logit
    " reasonable"     -0.08
    "ظة"              -0.07
    "ecause"          -0.07
    "olidays"         -0.07
    "896"             -0.07
    " Rather"         -0.07
    " زیرا"           -0.07
    " because"        -0.06
    "OUGH"            -0.06
    ".documents"      -0.06
    Positive Logits
    Token             Logit
    "hendis"          0.09
    " Hann"           0.09
    " HUD"            0.09
    " Hawk"           0.09
    " Hed"            0.09
    " Hammond"        0.09
    " Huang"          0.08
    " Helen"          0.08
    " HA"             0.08
    " Helm"           0.08
    Activation Density: 1.363%

    No Known Activations