INDEX
    Explanations

    phrases indicating acting in someone's interest or on someone's behalf

    phrases that emphasize representation and advocacy on behalf of others

    Negative Logits
    Token         Logit
    "unct"        -0.62
    "gr"          -0.60
    "olic"        -0.59
    "alach"       -0.59
    "cer"         -0.57
    "fitting"     -0.57
    "pol"         -0.57
    "aucus"       -0.57
    " Topic"      -0.56
    " Okin"       -0.56
    Positive Logits
    Token           Logit
    "steps"         0.84
    " selves"       0.68
    "agents"        0.68
    "ivas"          0.65
    "farious"       0.64
    "ombat"         0.64
    "²¾"            0.63
    "stretched"     0.63
    " endeavors"    0.63
    "issance"       0.62
    Activation Density: 0.160%

    No Known Activations