INDEX
    Explanations

    terms related to societal issues and conflicts

    phrases with high-frequency conjunctions and the word "and."

    Negative Logits
    Token       Logit
    "IRE"       -0.72
    "etting"    -0.64
    "united"    -0.58
    "arov"      -0.56
    "fold"      -0.54
    "GO"        -0.54
    "OGR"       -0.54
    "rison"     -0.54
    "HEAD"      -0.53
    "REE"       -0.53
    Positive Logits
    Token        Logit
    " the"       1.03
    "the"        0.93
    " });"       0.67
    " Cth"       0.67
    " its"       0.67
    " THE"       0.66
    " those"     0.65
    "ata"        0.65
    " whoever"   0.62
    " The"       0.62
    Activation Density: 0.521%

    No Known Activations