INDEX
    Explanations

    mentions of different groups of people in various contexts, such as researchers, voters, consumers, individuals, players, Jews, and liberals

    references to groups of people or individuals in various contexts

    Negative Logits
    Token           Logit
    "stars"         -0.64
    "ces"           -0.61
    " Shore"        -0.61
    "ILE"           -0.61
    "aughs"         -0.60
    "Aw"            -0.60
    "UV"            -0.59
    "DOWN"          -0.58
    "sie"           -0.58
    "shows"         -0.58
    Positive Logits
    Token           Logit
    " are"          1.10
    " aren"         1.03
    " perceive"     0.98
    " prefer"       0.97
    " have"         0.97
    " were"         0.95
    " everywhere"   0.94
    " crave"        0.93
    " realize"      0.91
    " weren"        0.91
    Activation Density: 0.294%

    No Known Activations