INDEX
    Negative Logits
    endiz         -0.08
    plane         -0.07
    nay           -0.07
     gau          -0.07
     Stella       -0.07
     Gore         -0.07
    antage        -0.07
    ibb           -0.07
                  -0.07
     Weld         -0.07
    Positive Logits
    办法          0.10
     regard       0.09
     friction     0.08
     sacrificing  0.08
     resort       0.08
     reinvent     0.08
     Pit          0.08
     respecto     0.07
     caring       0.07
    akin          0.07
    Act Density 0.038%
    No Known Activations