INDEX
    NEGATIVE LOGITS
    Token            Logit
    "قرار"           -0.09
    (not shown)      -0.08
    " ix"            -0.07
    " hassle"        -0.07
    "czę"            -0.07
    (not shown)      -0.07
    " Premium"       -0.07
    " bolo"          -0.07
    (not shown)      -0.07
    " Tie"           -0.07
    POSITIVE LOGITS
    Token            Logit
    " fashioned"     0.08
    " forgotten"     0.08
    (not shown)      0.08
    " hoped"         0.08
    " Robbins"       0.07
    " federation"    0.07
    " Shirley"       0.07
    " auparavant"    0.07
    ",b"             0.07
    " Hun"           0.07
    Activation Density: 0.094%

    No Known Activations