    Explanations

    instances of hypocrisy in political and social contexts

    NEGATIVE LOGITS
    rouw
    -0.15
    _PK
    -0.14
    olla
    -0.14
    andro
    -0.14
    onomy
    -0.13
    otas
    -0.13
    ーラ
    -0.13
    ).__
    -0.13
    OrFail
    -0.13
     Riders
    -0.13
    POSITIVE LOGITS
     whereas
    0.29
     despite
    0.27
     जबक
    0.24
     while
    0.23
     Whereas
    0.22
    while
    0.20
     حالی
    0.20
    ibel
    0.20
     whilst
    0.19
     without
    0.18
    Act Density 0.260%

    No Known Activations