INDEX
    Explanations

    phrases related to power dynamics and potential conflict

    phrases or sentences that emphasize contrasts or contradictions

    Negative Logits
    UF              -0.81
    oe              -0.67
    cia             -0.66
    une             -0.65
    orn             -0.65
    uchin           -0.64
    uy              -0.63
    izen            -0.61
    ļéĨĴ            -0.61
    interstitial    -0.61
    Positive Logits
     however        1.43
     though         1.31
     albeit         1.20
     meanwhile      1.09
     huh            1.08
     although       1.03
     but            0.97
     eh             0.94
     namely         0.90
     moreover       0.88
    Activation Density: 0.706%

    No Known Activations