    NEGATIVE LOGITS
    Token           Logit
    " Aaron"        -0.08
    " homework"     -0.08
    " surprising"   -0.07
    " Leonardo"     -0.07
    " Garner"       -0.07
    "(nodes"        -0.07
    "Assets"        -0.07
    " zijn"         -0.07
    " dai"          -0.07
    " subsets"      -0.07
    POSITIVE LOGITS
    Token           Logit
    " trimmed"      0.10
    "长度"          0.09
    "_lengths"      0.09
    " Length"       0.08
    "trap"          0.08
    " microphone"   0.08
    "linewidth"     0.08
    "_trim"         0.08
    "_len"          0.08
    "Length"        0.08
    Activation Density: 0.006%

    No Known Activations