    Explanations

    enumerated items or lists

    Negative Logits
     A           -0.70
                 -0.68
     (           -0.67
     in          -0.66
     it          -0.64
     if          -0.64
     let         -0.63
     we          -0.61
     I           -0.60
     des         -0.59
    Positive Logits
     myſelf      1.41
    ſelf         1.38
     متعلقه      1.34
    ſelves       1.33
     purpoſe     1.31
     leſs        1.29
     Efq         1.27
     auffi       1.27
     ſta         1.27
    AndEndTag    1.26
    Activation Density: 0.030%

    No Known Activations