    Explanations

    words related to societal issues, oppression, and political commentary

    Negative Logits
    cloth             -0.77
    BOOK              -0.72
    book              -0.66
    friend            -0.64
    manship           -0.62
    lihood            -0.62
    words             -0.62
    GAME              -0.60
    soDeliveryDate    -0.60
    rooms             -0.59

    Positive Logits
    ized              2.22
    ization           2.18
    izing             2.13
    istic             1.99
    ize               1.88
    ism               1.86
    izes              1.85
    ists              1.84
    isation           1.84
    istically         1.82

    Activation Density: 3.624%

    No Known Activations