    Explanations

    phrases related to societal expectations and criticisms of societal norms

    Negative Logits
    Token         Logit
    eldorf        -0.15
    avra          -0.15
    andi          -0.14
    undan         -0.14
    olla          -0.14
    likelihood    -0.13
    znik          -0.13
    _EXISTS       -0.13
    CRET          -0.13
    idak          -0.13

    Positive Logits
    Token         Logit
     supposed     0.99
     suppose      0.75
     meant        0.58
     supposedly   0.58
     purported    0.52
     alleged      0.47
     allegedly    0.45
     intended     0.43
     Suppose      0.39
     SUP          0.37
    Activation Density: 0.329%

    No Known Activations