INDEX
    Explanations

    Numbers enclosed in dollar signs

    Negative Logits
    (token not recovered)   -0.88
    (token not recovered)   -0.71
    <h2>                    -0.65
    </h3>                   -0.64
    (token not recovered)   -0.64
    </h2>                   -0.63
    </h4>                   -0.63
    <sup>                   -0.61
    ©                       -0.61
    </b>                    -0.60
    Positive Logits
     myſelf                 0.98
    $. (+ line break)       0.95
     Efq                    0.94
     \\ (+ line break)      0.94
     themſelves             0.93
     $_"                    0.93
     ſhe                    0.89
    ſelves                  0.87
    ^(@)                    0.87
     himſelf                0.86
    Act Density 0.054%

    No Known Activations