INDEX
    Negative Logits
    "advisor"    -0.07
    "	Output"    -0.07
    "’."         -0.07
    "subtract"   -0.07
    " reson"     -0.06
    " contest"   -0.06
    " tv"        -0.06
    "importe"    -0.06
    " anz"       -0.06
    "”)."        -0.06
    Positive Logits
    " BAB"       0.07
    "いか"        0.06
    " Wil"       0.06
    "eği"        0.06
    " Esp"       0.06
    " Kun"       0.06
    " Meeting"   0.06
    " ули"       0.06
    "UTE"        0.06
    " Αλ"        0.06
    Activation Density: 0.004%

    No Known Activations