    NEGATIVE LOGITS
    Token          Logit
    "ineries"      -0.08
    " самая"       -0.08
    " آج"          -0.08
    " zgod"        -0.08
    " dy"          -0.08
    " вполне"      -0.08
    " nale"        -0.07
    " rr"          -0.07
    "perts"        -0.07
    " стать"       -0.07
    POSITIVE LOGITS
    Token          Logit
    " champion"    0.11
    "Champion"     0.10
    " champions"   0.10
    "champ"        0.09
    " campeón"     0.09
    " Champions"   0.09
    "apus"         0.08
    "ampion"       0.08
                   0.08
    " vocal"       0.08
    Activation Density: 0.009%

    No Known Activations