INDEX
    Explanations
    Negative Logits
     assumptions   -0.66
     corrid        -0.66
     retina        -0.66
     whisk         -0.66
     mant          -0.63
     empowered     -0.62
     exemptions    -0.62
     palate        -0.61
     optimistic    -0.59
     obsolete      -0.58
    Positive Logits
    soever    0.87
    nian      0.87
    hyde      0.84
    wm        0.84
    chev      0.82
    tch       0.81
    hex       0.80
    Edit      0.77
    sum       0.76
    Ever      0.75
    Act Density 0.066%

    No Known Activations