    Negative Logits
     Expert           -0.09
     expert           -0.08
     голов            -0.08
     big              -0.08
     sredstva         -0.08
     محور             -0.08
    .je               -0.08
     paperwork        -0.08
     пита             -0.08
     gewährleisten    -0.07
    Positive Logits
     testify          0.08
     moons            0.08
    internal          0.08
     chur             0.08
    Lucas             0.08
    msgs              0.07
    Msgs              0.07
     disputes         0.07
     Galileo          0.07
    -this             0.07
    Activation Density: 0.003%

    No Known Activations