INDEX
    Explanations
    NEGATIVE LOGITS
    ori             -0.07
    uno             -0.07
    wij             -0.06
    hwnd            -0.06
    رانی            -0.06
    opoulos         -0.06
     renegot        -0.06
    -key            -0.06
    Iran            -0.06
    еров            -0.06
    POSITIVE LOGITS
     notifies        0.07
     göre            0.07
     Intervention    0.06
     Became          0.06
     stopped         0.06
     fired           0.06
     někdy           0.06
     firing          0.06
    fft              0.06
     Thank           0.06
    Activation Density: 0.037%

    No Known Activations