INDEX
    Explanations

    instances of words related to promotions or the utilization of resources or information for specific purposes

    themes related to power dynamics and social issues

    Negative Logits
    essing     -0.58
    "],"       -0.57
    OSED       -0.57
    osion      -0.56
    icable     -0.55
    layer      -0.55
    ategory    -0.54
    retty      -0.54
    én         -0.52
    sonian     -0.52
    Positive Logits
     sparing           1.17
     wisely            1.10
     to                1.02
     interchange       0.99
     extensively       0.97
     pseudonym         0.96
     as                0.83
     instead           0.83
     metaphor          0.81
     inappropriately   0.81
    Act Density 0.363%

    No Known Activations