INDEX
    Explanations

    terms related to power structures and their effects on society

    Negative Logits
    tsy         -0.17
    odash       -0.17
    utsch       -0.17
    tls         -0.17
    tember      -0.17
    (s          -0.17
    tep         -0.17
    tridge      -0.16
    placer      -0.16
    togroup     -0.16
    Positive Logits
    es           1.40
    (es          0.76
    esin         0.61
    ES           0.59
    s            0.59
    eses         0.57
    ses          0.53
    esModule     0.49
    'es          0.49
    ness         0.48
    Activation Density: 0.299%

    No Known Activations