    Explanations

    phrases related to societal and political criticisms

    Negative Logits
    orex
    -0.17
     поверх
    -0.15
    oho
    -0.15
    ilst
    -0.14
    ЛЬ
    -0.14
    .throw
    -0.14
    earch
    -0.14
    velt
    -0.14
    ued
    -0.14
    ila
    -0.13
    Positive Logits
    apgolly
    0.14
    аÐ
    0.14
    υγ
    0.14
     Authentic
    0.13
    基
    0.13
    *)((
    0.13
     Ging
    0.12
    .='<
    0.12
     conclusions
    0.12
     Yuk
    0.12
    Activation Density: 0.414%

    No Known Activations