INDEX
    Explanations

    references to opposition or dissent in political contexts

    Negative Logits
    igans  -0.16
    ulner  -0.15
    pent  -0.15
    .Constant  -0.15
     Rit  -0.15
     sublicense  -0.14
    ingle  -0.14
    \<  -0.14
    gere  -0.14
    лÑĸв  -0.14
    Positive Logits
    andalone  0.15
    جاد  0.14
    éŃļ  0.14
    eyse  0.14
    EventManager  0.14
    uger  0.13
    "]."  0.13
    ãĥ³ãĥģ  0.13
     jadx  0.13
    ãİ  0.13
    Act Density 0.012%

    No Known Activations