    Explanations

    the start of new sections or segments indicated by specific tokens

    Q followed by questions

    Negative Logits
    Token          Logit
    " Efq"         -1.41
    " purpoſe"     -1.32
    " pleaſure"    -1.28
    " myſelf"      -1.25
    " raiſ"        -1.25
    " houſe"       -1.25
    " Reſ"         -1.23
    " Jefus"       -1.23
    " Majefty"     -1.23
    " Anſ"         -1.22
    Positive Logits
    Token          Logit
    "<eos>"         0.71
    " se"           0.56
    "↵↵"            0.56
    " la"           0.52
    "<strong>"      0.51
    " m"            0.50
    " za"           0.49
    " ..."          0.49
                    0.48
                    0.48
    Act Density 0.009%

    No Known Activations