    Negative Logits
    " pany"                 -0.08
    (token not captured)    -0.08
    (token not captured)    -0.07
    " gioco"                -0.07
    (token not captured)    -0.07
    " schaffen"             -0.07
    " businessman"          -0.07
    "ੂਰ"                    -0.07
    " xpos"                 -0.07
    " schafft"              -0.07
    Positive Logits
    " sonrası"              0.09
    " عض"                   0.09
    " gym"                  0.08
    "cation"                0.08
    " imper"                0.08
    (token not captured)    0.08
    (token not captured)    0.08
    " dosis"                0.08
    " gyms"                 0.08
    " prescription"         0.07
    Activation Density: 0.008%

    No Known Activations