    NEGATIVE LOGITS
    "zee"            -0.07
    "ertime"         -0.07
    "atte"           -0.07
    "放弃了"         -0.07
    "ful"            -0.06
    " предн"         -0.06
    "Configuration"  -0.06
    "不服"           -0.06
    "adden"          -0.06
    "ביניהם"         -0.06
    POSITIVE LOGITS
    " Bishop"    0.08
    "火焰"       0.07
    " Acting"    0.07
    " Potion"    0.07
    " scam"      0.07
    " drifted"   0.07
                 0.07
    " tooltip"   0.07
    " tức"       0.07
    " Trap"      0.06
    Activation Density: 0.004%

    No Known Activations