INDEX
    Explanations
    Negative Logits
    Token            Logit
    "fw"             -0.75
    "whe"            -0.68
    "Limited"        -0.66
    "iverse"         -0.66
    " Became"        -0.65
    "Published"      -0.63
    "erity"          -0.63
    "bard"           -0.63
    "oji"            -0.62
    "iol"            -0.62
    Positive Logits
    Token            Logit
    " previous"      1.04
    " traditional"   1.02
    " typical"       1.01
    " usual"         0.98
    " conventional"  0.96
    " counterparts"  0.93
    " ours"          0.89
    " predecessors"  0.89
    " other"         0.83
    "lihood"         0.83
    Activation Density: 1.009%

    No Known Activations