INDEX
    Explanations

    variants of the word "or" in various contexts

    Negative Logits
    Token        Logit
    " NEVER"     -0.17
    " neither"   -0.17
    " Never"     -0.16
    "Never"      -0.16
    " ç¦"        -0.15
    "Neither"    -0.15
    " Neither"   -0.15
    "never"      -0.15
    "_NE"        -0.14
    "inski"      -0.14
    Positive Logits
    Token          Logit
    " not"         0.48
    "not"          0.38
    " Not"         0.35
    "Not"          0.34
    "_not"         0.28
    ".not"         0.27
    " otherwise"   0.26
    "\tnot"        0.26
    " NOT"         0.25
    "-not"         0.24
    Activation Density: 0.051%

    No Known Activations