INDEX
    Explanations
    NEGATIVE LOGITS
    plus          -0.07
     quant        -0.07
    zipcode       -0.07
     bes          -0.07
    -info         -0.07
     CONS         -0.07
     subject      -0.07
    exp           -0.06
    law           -0.06
     packageName  -0.06
    POSITIVE LOGITS
                   0.29
                   0.24
    ,每             0.20
                   0.14
                   0.11
                   0.10
                   0.09
    ,这             0.09
                   0.08
                   0.08
    Act Density 0.004%

    No Known Activations