INDEX
    Explanations

    references to protests and social movements

    Negative Logits
    ģ             -0.16
    403           -0.16
     th           -0.15
    oj            -0.15
    ienia         -0.15
     gy           -0.15
    136           -0.15
     commission   -0.14
     deter        -0.14
     en           -0.14
    Positive Logits
    sko           0.24
    олош          0.23
    isti          0.21
    ička          0.21
    SKI           0.21
    ič            0.20
    arna          0.20
    ич            0.19
    porno         0.19
    ски           0.18
    Act Density 0.010%

    No Known Activations