    Explanations

    instances of the word "or."

    NEGATIVE LOGITS
    "ires"         -0.70
    "Ident"        -0.65
    "IDENT"        -0.64
    "ETS"          -0.62
    "efer"         -0.62
    "IGHTS"        -0.60
    " Pony"        -0.60
    "mitter"       -0.59
    "moil"         -0.59
    "EMP"          -0.58

    POSITIVE LOGITS
    "nam"           1.22
    "acle"          1.19
    "chard"         1.17
    "chid"          1.12
    "ifice"         1.09
    "acles"         1.08
    "Else"          1.02
    "nery"          0.99
    "ific"          0.99
    " otherwise"    0.97
    Act Density 0.180%

    No Known Activations