    Explanations

    phrases indicating a collection or a set of related items

    instances of the word "These," indicating a focus on referring to groups or collections of items or people

    Negative Logits
    Token          Logit
    " planning"    -0.70
    " wound"       -0.69
    " ticket"      -0.63
    " reflex"      -0.63
    " status"      -0.63
    " finished"    -0.62
    " tele"        -0.62
    " manager"     -0.62
    " board"       -0.62
    " sacked"      -0.61
    Positive Logits
    Token          Logit
    "These"        3.06
    "these"        2.31
    " These"       2.18
    "Those"        1.88
    " THESE"       1.86
    "Such"         1.70
    "This"         1.64
    "Each"         1.52
    "They"         1.51
    "Both"         1.42
    Activation Density 0.017%

    No Known Activations