INDEX
    Explanations

    words related to political and economic issues

    Negative Logits
    Token       Logit
    "bender"    -0.78
    "puff"      -0.71
    "cum"       -0.70
    "tar"       -0.69
    "icter"     -0.66
    "conom"     -0.66
    "wic"       -0.65
    "wrap"      -0.64
    "ussen"     -0.64
    "more"      -0.64
    Positive Logits
    Token             Logit
    "selves"          1.38
    " own"            1.20
    " ancestors"      1.02
    " beloved"        1.00
    " selves"         0.95
    " ourselves"      0.93
    " asses"          0.93
    " adversaries"    0.92
    " collective"     0.90
    " hearts"         0.88
    Activation Density: 0.315%

    No Known Activations