INDEX
    Explanations

    phrases related to campaigns or initiatives

    references to specific campaigns or initiatives, particularly those related to societal issues

    Negative Logits
     Naz          -0.66
     cruelty      -0.61
     casualties   -0.59
     labels       -0.59
     spo          -0.59
     refuge       -0.58
     Integrity    -0.57
     details      -0.57
     frames       -0.57
     spoilers     -0.57
    Positive Logits
    arter     4.80
    arters    2.60
    arts      1.29
    ART       1.17
    arty      1.13
    eret      1.06
    RNA       1.03
    arth      0.99
    arted     0.99
    amber     0.95
    Activation Density: 0.012%

    No Known Activations