    Explanations

    phrases related to political opinions and actions, with a focus on particular political figures or movements

    Negative Logits
    "ibli"            -0.68
    " fortunes"       -0.62
    "ouf"             -0.60
    "angu"            -0.59
    "perty"           -0.58
    "ounge"           -0.56
    " Hist"           -0.54
    " square"         -0.53
    "MORE"            -0.53
    "Shift"           -0.53

    Positive Logits
    " by"              1.03
    " expressly"       0.84
    " during"          0.84
    " jointly"         0.82
    " collabor"        0.81
    " pursuant"        0.80
    " instituted"      0.73
    " artificially"    0.72
    " unanimously"     0.72
    " aback"           0.69

    Act Density 0.259%

    No Known Activations