INDEX
    Explanations

    comparative language emphasizing differences and similarities

    Negative Logits
    ı
    -0.16
    udas
    -0.15
    巻
    -0.15
    arsing
    -0.15
    arkin
    -0.14
     影
    -0.14
     Dominion
    -0.14
    HDR
    -0.14
    Domin
    -0.14
    omi
    -0.14
    Positive Logits
     spins
    0.17
     Lifetime
    0.16
     spin
    0.16
    auc
    0.15
    spin
    0.15
    Lifetime
    0.15
    mere
    0.15
    icode
    0.15
     Nice
    0.15
     Spin
    0.14
    Act Density 0.285%

    No Known Activations