INDEX
    Explanations
    Negative Logits
     untreated      -0.09
     emis           -0.09
     depositing     -0.08
    'b              -0.08
    Unt             -0.08
     invest         -0.08
     brook          -0.08
     stati          -0.07
     prescr         -0.07
     Byr            -0.07
    Positive Logits
     интерф         0.09
     авт            0.09
     торм           0.09
    -equipped       0.08
     системы        0.08
     bediening      0.08
     autofocus      0.08
     vertra         0.08
    -enabled        0.07
     haut           0.07
    Activation Density: 0.004%

    No Known Activations