    Explanations

    various forms of formatting or markup elements within the text

    Negative Logits
    Logit   Token
    -0.93
    -0.84   "»."
    -0.83
    -0.80   "»,"
    -0.77   "…)."
    -0.75   "…»"
    -0.73   " «"
    -0.73   "«."
    -0.72   " »."
    -0.69   "»:"
    Positive Logits
    Logit   Token
    2.16    "----------------"
    1.36    "--------------"
    1.34    "---------------"
    1.33    "-------------"
    1.32    "------------"
    1.28    "-----------"
    1.26    "----------"
    1.26    "--------"
    1.24    "------"
    1.22    "---------"
    Activation Density: 0.308%

    No Known Activations