INDEX
    Explanations

    the beginning of new sections or paragraphs in a document

    Negative Logits
     Koz            -0.99
    ^(@)            -0.98
     لينك           -0.92
    leſs            -0.88
     numberWith     -0.88
     rhestr         -0.87
     XNUMX          -0.86
    ufact           -0.86
    NUMX            -0.83
    =-=-=-=-        -0.82

    Positive Logits
    </sup>          1.36
    </sub>          1.22
    </u>            1.20
    </em>           1.07
    </s>            1.06
    </i>            0.99
    </code>         0.91
    </strong>       0.86
    <eos>           0.83
                    0.81
    Activation Density 0.118%

    No Known Activations