INDEX
    Explanations

    conjunctions "or" with high activation values

    instances of the word "or."

    Negative Logits
    Token       Logit
    "ires"      -0.73
    "estamp"    -0.63
    "ourn"      -0.61
    "irlf"      -0.61
    "ublic"     -0.59
    "idem"      -0.57
    "ascus"     -0.57
    " Wast"     -0.57
    "irms"      -0.57
    "ights"     -0.56
    Positive Logits
    Token               Logit
    "ifice"             1.35
    "Else"              1.31
    "acles"             1.27
    "acle"              1.25
    "acular"            1.14
    "chard"             1.10
    "chid"              1.08
    "nery"              1.04
    " alternatively"    1.02
    "ific"              1.00
    Activation Density: 0.192%

    No Known Activations