INDEX
    Explanations
    Negative Logits
    "ılığı"     -0.11
    " هو"       -0.09
    "ened"      -0.09
    "ening"     -0.08
    "ωνα"       -0.08
    "endeu"     -0.08
    " qilib"    -0.08
                -0.08
    " الر"      -0.08
    "ენი"       -0.08
    POSITIVE LOGITS
    "Comb"      0.09
    "Log"       0.08
    " lib"      0.08
    "Geo"       0.08
    "Free"      0.08
    "801"       0.07
    "Lib"       0.07
    "Refund"    0.07
    "irar"      0.07
    "Capt"      0.07
    Activation Density: 0.001%

    No Known Activations