INDEX
    Explanations

    words related to social issues and policy, particularly with a critical or contentious tone

    negative or critical descriptors associated with various subjects

    Negative Logits
    iage        -0.72
    urtles      -0.71
    ggies       -0.67
     Horizons   -0.66
    ateurs      -0.65
    eeks        -0.65
    iets        -0.64
    regor       -0.64
    ynthesis    -0.63
     Jackets    -0.62
    Positive Logits
    -)          0.90
    -           0.90
    ]-          0.83
    istic       0.82
    functional  0.82
    )-          0.81
    '-          0.80
    ocratic     0.79
    -[          0.77
    "-          0.75
    Activation Density: 0.330%

    No Known Activations