INDEX
    Explanations

    the presence of specific formatting or symbols, particularly whitespace or empty characters, in the text

    Negative Logits
     themſelves       -0.90
     himſelf          -0.84
     myſelf           -0.80
     itſelf           -0.79
     springfox        -0.78
     Shakspeare       -0.73
    fromCharCode      -0.68
     fubject          -0.66
     pleaſure         -0.66
     reaſon           -0.65

    Positive Logits
     the              1.10
    <eos>             0.76
    the               0.75
     The              0.70
     THE              0.61
    ScopeManager      0.61
    WindowConstants   0.59
     its              0.58
     that             0.57
     our              0.56
    Act Density 0.262%

    No Known Activations