INDEX
    Explanations

    special formatting or symbols in the text

    Negative Logits
    Token               Logit
    "RegressionTest"    -1.10
    " ſy"               -1.03
    " myſelf"           -1.02
    " purpoſe"          -1.01
    " uſed"             -0.98
    " pleaſure"         -0.94
    " fevere"           -0.94
    " preſent"          -0.94
    " reaſon"           -0.93
    " propOrder"        -0.92

    Positive Logits
    Token               Logit
    "s"                 0.85
    "</sub>"            0.72
    "</i>"              0.70
    " }}$"              0.69
    "̈"                 0.68
    " }}"               0.68
    "/}"                0.67
    "</em>"             0.67
    "i"                 0.66
    "</sup>"            0.65
    Act Density 0.258%

    No Known Activations