INDEX
    Explanations

    phrases that highlight hypocrisy in social and political discourse

    Negative Logits
    conde       -0.15
     Seg        -0.15
    ÑĢави       -0.15
    igar        -0.15
    azzi        -0.14
    .LOG        -0.14
     Lund       -0.14
    ãĥ¼ãĤº      -0.14
    rone        -0.14
    ogg         -0.13
    Positive Logits
    isz         0.16
    algo        0.14
    udder       0.14
    ê³Ħ         0.14
    едÑĮ        0.14
     Dangerous  0.14
    aida        0.14
    -hash       0.14
    ixin        0.13
    ¨           0.13
    Activation Density: 0.157%

    No Known Activations