    Explanations

    domination and submission

    NEGATIVE LOGITS
    Token                 Value
    "ي"                   1.32
    "an"                  1.13
    "tive"                1.12
    (no token shown)      1.09
    " السبب"              1.09
    "不然"                1.08
    " عاوز"               1.08
    "turb"                1.06
    " isso"               1.06
    "elephant"            1.06
    POSITIVE LOGITS
    Token                 Value
    (no token shown)      1.16
    (no token shown)      0.93
    " duly"               0.92
    " Countries"          0.91
    (no token shown)      0.90
    "seits"               0.90
    "ləşdir"              0.89
    "窿"                  0.89
    " italics"            0.88
    " traigo"             0.88
    Activation Density: 0.065%

    No Known Activations