INDEX
    Negative Logits
    Token        Logit
    " plaque"    -0.08
    " maz"       -0.08
    " Pier"      -0.08
    "REV"        -0.08
    "PCB"        -0.08
    "rev"        -0.07
    " Martha"    -0.07
    " bab"       -0.07
    "Maz"        -0.07
    " Maggie"    -0.07
    Positive Logits
    Token           Logit
    "worthiness"    0.08
    "-catching"     0.08
    ".threshold"    0.07
    "收益"          0.07
    "'ident"        0.07
    " leng"         0.07
    " vine"         0.07
    " optimism"     0.07
    " philanth"     0.07
    " celu"         0.07
    Activation Density: 0.019%

    No Known Activations