INDEX
    Explanations

    instances of the word "exception" or its variations in text

    references to exceptions and deviations from rules or norms

    Negative Logits
    Token         Logit
    " destro"     -0.70
    " rall"       -0.62
    " goodbye"    -0.60
    "ching"       -0.60
    " pestic"     -0.60
    " irrad"      -0.59
    "roying"      -0.58
    "raz"         -0.57
    "sonian"      -0.57
    " Beet"       -0.57
    Positive Logits
    Token             Logit
    "arily"           0.88
    "ional"           0.84
    "perty"           0.82
    "ality"           0.82
    "ĸļ"              0.81
    "als"             0.79
    "rules"           0.79
    "izzle"           0.78
    "alties"          0.76
    " exceptions"     0.72
    Act Density 0.034%

    No Known Activations