    Explanations

    addressing an issue

    Negative Logits
    Token            Logit
    " myſelf"        -1.30
    " itſelf"        -1.25
    " Efq"           -1.16
    "ſelves"         -1.07
    " Jefus"         -1.07
    " himſelf"       -1.05
    " themſelves"    -1.05
    "NUMX"           -1.05
    " Theſe"         -1.04
    " Cæsar"         -1.03
    Positive Logits
    Token            Logit
    " the"           0.90
    " some"          0.68
    " this"          0.66
    " all"           0.66
                     0.65
    " their"         0.60
    " a"             0.59
    " and"           0.59
    "("              0.59
    ","              0.59
    Activation Density: 0.067%

    No Known Activations