    Explanations

    colons and formatting indicators

    Negative Logits
     myſelf
    -4.29
     Efq
    -4.11
     itſelf
    -3.94
     Monfieur
    -3.90
     Theſe
    -3.85
     Jefus
    -3.74
     Majefty
    -3.62
     pleaſure
    -3.60
     ―――――
    -3.58
     ſeveral
    -3.53
    Positive Logits
     .
    2.23
    ,
    2.11
     :
    2.10
    .
    2.09
    2.07
     (
    2.03
     ,
    1.87
    <eos>
    1.79
    ↵↵
    1.76
     in
    1.74
    Act Density 0.536%

    No Known Activations