INDEX
    Explanations
    Negative Logits
    Token            Logit
    " NP"            -0.07
    " Medicine"      -0.06
    " qualitative"   -0.06
    " funct"         -0.06
    " Tb"            -0.06
    "great"          -0.06
    "uencia"         -0.06
    " Wein"          -0.06
    " erfolgre"      -0.05
    " Neck"          -0.05
    Positive Logits
    Token            Logit
    "SignUp"          0.07
    " lối"            0.07
    " следующ"        0.07
    " ire"            0.06
                      0.06
    "?>'"             0.06
    "Heading"         0.06
    "_decay"          0.06
    " Hire"           0.06
                      0.06
    Act Density 0.006%

    No Known Activations