INDEX
    Explanations

    phrases related to expectations and societal norms

    Negative Logits
    avra          -0.16
    ồi            -0.16
     придется     -0.15
    kil           -0.14
    znik          -0.14
    otal          -0.14
    aal           -0.14
    orial         -0.14
    tility        -0.13
    _EXISTS       -0.13
    Positive Logits
     supposed     1.05
     suppose      0.81
     meant        0.63
     supposedly   0.51
     purported    0.44
     Suppose      0.43
     alleged      0.43
     SUP          0.43
     intended     0.41
     allegedly    0.40
    Act Density 0.259%

    No Known Activations