    Explanations

    Mathematical and LaTeX expressions

    Negative Logits
    Token      Logit
    (blank)    1.01
    (blank)    1.00
    (blank)    0.99
    "-*"       0.98
    (blank)    0.98
    ".**"      0.96
    "***"      0.96
    "λ"        0.95
    "**"       0.94
    "µ"        0.93
    Positive Logits
    Token      Logit
    " \"       1.73
    "^{\"      1.32
    " \%$"     1.19
    " \%$."    1.18
    " \%$,"    1.14
    " \;"      1.12
    " \%"      1.09
    " \,"      1.08
    "^{"       1.08
    "+\"       1.07
    Act Density 0.254%

    No Known Activations