INDEX
    Explanations
    Negative Logits
    " takeover"      -0.07
    "blocking"       -0.06
    "75"             -0.06
    ".collection"    -0.06
    " servisi"       -0.06
    " embedded"      -0.06
    " decides"       -0.06
    " TAR"           -0.06
    " jist"          -0.06
    "ascript"        -0.06
    Positive Logits
    " صاد"            0.07
    " horny"          0.07
                      0.07
    " loại"           0.07
    "achinery"        0.06
    "rophy"           0.06
    " Machinery"      0.06
    "Donate"          0.06
    " Wales"          0.06
    "WG"              0.06
    Act Density 0.018%

    No Known Activations