    Explanations

    patterns of words indicating comparisons or references to groups

    Negative Logits
    Token       Logit
    "umpt"      -0.17
    "globals"   -0.15
    "-li"       -0.15
    "ollar"     -0.15
    "ÄĻk"       -0.15
    "loh"       -0.14
    "orra"      -0.14
    "aeda"      -0.14
    "queues"    -0.14
    " èĢ"       -0.14
    Positive Logits
    Token       Logit
    " us"       0.28
    " them"     0.20
    ".us"       0.17
    "ender"     0.16
    "aze"       0.15
    "(us"       0.14
    "ssi"       0.14
    "Us"        0.14
    "igin"      0.14
    "455"       0.14
    Activation Density: 0.064%

    No Known Activations