INDEX
    Negative Logits
     that         1.22
    that          1.13
     to           1.09
     aplatis      1.05
    s             1.05
     esposa       1.03
     với          1.02
     brasileiro   1.01
    ");           0.99
    ς             0.98
    Positive Logits
    ه             1.60
    ف             1.23
    ب             1.22
    ان            1.20
    in            1.13
    ص             1.12
    .             1.10
                  1.09
    an            1.08
    u             1.07
    Act Density 0.048%

    No Known Activations