    NEGATIVE LOGITS
    Token               Logit
    "plates"            -0.69
    "ELD"               -0.68
    "currency"          -0.68
    "grounds"           -0.67
    "venue"             -0.67
    "iday"              -0.64
    "DIT"               -0.63
    "assed"             -0.61
    " cleaner"          -0.60
    "prof"              -0.60
    POSITIVE LOGITS
    Token               Logit
    " underestimate"    1.19
    " forget"           1.17
    " hesitate"         1.16
    " worry"            1.12
    " bother"           1.09
    " confuse"          1.05
    " misunderstand"    0.93
    " fret"             0.91
    " expect"           0.88
    " Forget"           0.88
    Activation Density: 0.052%

    No Known Activations