INDEX
    Explanations

    section dividers or headings in the document

    Negative Logits
    Token                Logit
    " T"                 -0.70
    "RetentionPolicy"    -0.64
    (token not shown)    -0.63
    " ch"                -0.63
    "t"                  -0.62
    (token not shown)    -0.58
    " t"                 -0.57
    " ab"                -0.57
    " sub"               -0.57
    " I"                 -0.56
    Positive Logits
    Token                Logit
    "=========="         1.19
    " ------"            1.17
    " ----------"        1.15
    "----------"         1.15
    "==========="        1.14
    "============"       1.14
    "============="      1.13
    " --------"          1.11
    " -----------"       1.11
    "========="          1.11
    Activation Density: 0.231%

    No Known Activations