INDEX
    Explanations

    Needs to be done

    Negative Logits
     want           -0.09
     tasked         -0.08
     invited        -0.08
     unable         -0.08
     envisage       -0.08
     merkt          -0.08
    named           -0.08
     ਆਪਣੇ           -0.08
     willing        -0.08
     able           -0.08
    Positive Logits
     occur          0.09
    展开             0.09
     dauern         0.09
     outweigh       0.09
     funktionieren  0.09
     funktioniert   0.09
     berlangsung    0.08
    的发展            0.08
     gelesen        0.08
     varier         0.08
    Act Density 0.199%

    No Known Activations