INDEX
    Explanations

    words related to social issues and activism such as legitimizing, tolerating, destabilizing, and revitalizing

    Negative Logits
    Token         Logit
    "HOU"         -0.71
    "cloth"       -0.64
    " Meadows"    -0.62
    "uden"        -0.60
    "erity"       -0.60
    "GAME"        -0.60
    "words"       -0.59
    "tower"       -0.56
    " Ank"        -0.56
    " Aad"        -0.56
    Positive Logits
    Token         Logit
    "ized"        2.52
    "ization"     2.52
    "izing"       2.48
    "izations"    2.28
    "izes"        2.24
    "izers"       2.20
    "isation"     2.17
    "ize"         2.12
    "ised"        2.11
    "izer"        2.06
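
The positive-logit tokens listed above all appear to be "-ize"/"-ise" suffix pieces, which may be a more direct reading of the feature than the topical explanation. As a minimal sketch, the tokens can be grouped by spelling to check this. The values are copied from the list; the `suffix_family` helper is hypothetical and only for illustration, not part of any dashboard API.

```python
# Top positive-logit tokens and their values, copied from the list above.
positive_logits = {
    "ized": 2.52, "ization": 2.52, "izing": 2.48, "izations": 2.28,
    "izes": 2.24, "izers": 2.20, "isation": 2.17, "ize": 2.12,
    "ised": 2.11, "izer": 2.06,
}


def suffix_family(token: str) -> str:
    """Label a token by its leading spelling (hypothetical helper)."""
    if token.startswith("iz"):
        return "-ize"
    if token.startswith("is"):
        return "-ise"
    return "other"


# Group tokens by family, preserving the order in which they were listed.
families: dict[str, list[str]] = {}
for token in positive_logits:
    families.setdefault(suffix_family(token), []).append(token)

for family, tokens in families.items():
    print(f"{family}: {len(tokens)} tokens -> {tokens}")
```

Under these assumptions, every promoted token falls into one of the two suffix families and none into "other".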
    Activation Density: 1.791%

    No Known Activations