INDEX
    Explanations
    NEGATIVE LOGITS
    Token               Logit
    <bos>               -2.15
    guang               -0.70
    sheng               -0.66
    fillType            -0.64
    qiao                -0.61
    yao                 -0.61
    zhong               -0.58
    springframework     -0.57
                        -0.56
    usercontent         -0.56
    POSITIVE LOGITS
    Token               Logit
     affor              1.51
     indestru           1.48
     unwarran           1.39
     maneu              1.38
     LXXX               1.36
     increa             1.35
     scrat              1.35
     perfet             1.35
     excru              1.35
     emphat             1.34
    Act Density 0.484%

    No Known Activations