INDEX
    Explanations
    Negative Logits
    Token               Logit
    " advisable"        -0.09
    "باشد"              -0.08
    "ETY"               -0.08
    " Saf"              -0.08
    " AUT"              -0.08
    " ramifications"    -0.08
    " ماد"              -0.08
    "、副"              -0.07
    "rebbero"           -0.07
    " اللازمة"          -0.07
    Positive Logits
    Token               Logit
    " expects"          0.10
    "expects"           0.09
    " expertise"        0.08
    " exige"            0.08
    " practitioners"    0.08
    " stringent"        0.08
    " emphasis"         0.08
    " reject"           0.08
    " experts"          0.08
    "强调"              0.07
    Activation Density: 0.099%

    No Known Activations