INDEX
    Negative Logits
    " internships"    -0.08
    " confort"        -0.08
    "oys"             -0.08
    " kolon"          -0.07
    " collectiv"      -0.07
    ",sum"            -0.07
    " коллектив"      -0.07
    "exclusive"       -0.07
    "=sum"            -0.07
    " unrealistic"    -0.07
    Positive Logits
    " publishes"      0.09
    " публика"        0.09
    " julka"          0.09
    " publier"        0.08
    " published"      0.08
    " veröff"         0.08
    " permalink"      0.08
    " публи"          0.08
    " 게시"           0.08
    " permanence"     0.08
    Activation Density: 0.008%

    No Known Activations