    Explanations

    superscripts and numbers

    Negative Logits
    ){                0.83
     (-\              0.81
    <unused1057>      0.80
    "){               0.80
    underbrace        0.77
                      0.75
     \,               0.75
    ynit              0.74
    ọng               0.74
     consequential    0.73
    Positive Logits
    </sup>            1.01
    *                 0.95
    \                 0.92
    +                 0.90
                      0.87
    </u>              0.83
    ***               0.83
    **                0.82
                      0.82
                      0.80
    Act Density 0.015%

    No Known Activations