INDEX
    Explanations

    specific phrases or constructs related to conditional or regulatory language

    Negative Logits
     myſelf
    -1.15
     pleaſure
    -0.97
    ſelf
    -0.96
     whoſe
    -0.94
     Houſe
    -0.93
     Majefty
    -0.93
     Theſe
    -0.92
     ―――――
    -0.92
     itſelf
    -0.91
     Monfieur
    -0.91
    Positive Logits
    <bos>
    2.45
    '
    1.20
    1.10
    1.07
    "
    0.98
    )
    0.92
    0.90
    ",
    0.89
    ',
    0.87
    0.87
    Act Density 2.648%

    No Known Activations