INDEX
    Explanations

    the beginning of new sections or topics in the text

    Negative Logits
     surla        -0.98
    niſſe         -0.97
     queſta       -0.96
    iſen          -0.90
    mpagne        -0.90
    ValueStyle    -0.90
    iffance       -0.88
    ſicht         -0.86
    ſcher         -0.86
    <unused14>    -0.85

    Positive Logits
    <td>          0.37
    But           0.36
    <sub>         0.34
    I             0.33
    Q             0.33
    <em>          0.32
    <strong>      0.31
    <code>        0.31
    expect        0.31
    2             0.31

    Act Density 0.065%

    No Known Activations