    Negative Logits
    [missing]      -0.06
    " الدولة"      -0.06
    "erge"         -0.05
    " Vale"        -0.05
    " bele"        -0.05
    " jente"       -0.05
    "erg"          -0.05
    "Ba"           -0.05
    "галі"         -0.05
    " ragazza"     -0.05
    Positive Logits
    " chuck"       0.09
    "TL"           0.07
    "Chuck"        0.07
    ".Once"        0.07
    "cery"         0.07
    " Chuck"       0.07
    " امن"         0.07
    "references"   0.07
    " charge"      0.07
    " click"       0.07
    Activation Density: 0.002%

    No Known Activations