INDEX
    Explanations
    Negative Logits
    xfe           -0.07
     Merr         -0.06
     Jake         -0.06
     Mun          -0.06
     dou          -0.06
     vengeance    -0.06
    /Peak         -0.06
     Sem          -0.06
    Search        -0.06
     jim          -0.06
    Positive Logits
    odel          0.07
                  0.07
    ;↵            0.07
     circulating  0.07
    olding        0.07
     round        0.07
    old           0.07
    Bright        0.07
    -profit       0.07
    ất            0.07
    Act Density 0.017%

    No Known Activations