INDEX
    Explanations

    terms and phrases denoting severity and issues related to governance or societal challenges

    Negative Logits
     ãĢĬ      -0.16
    bdb       -0.16
    affe      -0.16
    loat      -0.15
    agnost    -0.15
    ropa      -0.14
     @$_      -0.14
    rana      -0.14
    celik     -0.14
    lander    -0.13
    Positive Logits
    aklı      0.16
    []        0.14
    882       0.14
    usterity  0.14
    astr      0.14
    ym        0.14
     ç¶       0.14
    ÙĬÙĩ      0.14
    ο         0.13
     erót     0.13
    Activation Density: 0.122%

    No Known Activations