INDEX
    Explanations

    phrases that criticize political figures or actions

    Negative Logits
     للاسماء    -0.82
     '\\;'    -0.79
     >=",    -0.73
    Clik    -0.72
     يتيمه    -0.70
     jScrollPane    -0.69
    Referencie    -0.68
     الحره    -0.67
    DockStyle    -0.67
     pinulongan    -0.66

    Positive Logits
    🏻    0.58
     raider    0.56
     provoc    0.54
     fakes    0.50
    ρέπει    0.48
    livejournal    0.48
     demonstr    0.47
     bander    0.47
    hype    0.46
    remia    0.46
    Activation Density: 0.231%

    No Known Activations