    Explanations

    mathematical expressions and symbols in the text

    Negative Logits
    "                   -0.58
    -                   -0.55
    .                   -0.52
    ...                 -0.52
                        -0.51
    +                   -0.49
     P                  -0.47
    ,                   -0.45
    2                   -0.45
     te                 -0.45

    Positive Logits
    ſelf                1.02
     nahilalakip        1.02
     myſelf             0.93
     Reſ                0.90
     itſelf             0.90
    ſelves              0.86
     defaultstate       0.86
     GenerationType     0.85
     neceff             0.81
     raiſ               0.80
    Act Density 0.407%

    No Known Activations