    Negative Logits
    Token             Logit
    " conn"           -0.06
    (missing)         -0.06
    " posible"        -0.06
    " قرارداد"        -0.06
    "porter"          -0.06
    " conseils"       -0.06
    " Commonwealth"   -0.06
    "ested"           -0.06
    "şt"              -0.06
    " plagiarism"     -0.06
    Positive Logits
    Token             Logit
    " reaff"          0.07
    "maxcdn"          0.06
    "inki"            0.06
    "COME"            0.06
    " choke"          0.06
    " fired"          0.06
    " Seller"         0.06
    "тра"             0.06
    " Mim"            0.06
    " yayın"          0.06
    Activation Density: 0.047%

    No Known Activations