    Explanations

    concepts related to authority and judgment

    after commas, especially "of" and "the"

    Negative Logits
    Token | Logit
    awtextra | -1.03
     мәкал | -0.86
     للمعارف | -0.82
     चीज़ों | -0.81
    findpost | -0.80
     >=", | -0.79
     NSCoder | -0.79
    ’” | -0.78
    ’.” | -0.76
    DockStyle | -0.76
    Positive Logits
    Token | Logit
     to | 0.79
     is | 0.78
    , | 0.64
    - | 0.62
     of | 0.60
     in | 0.60
     are | 0.59
     -- | 0.58
     – | 0.58
     can | 0.57
    Activation Density: 0.165%

    No Known Activations