    Explanations

    phrases related to societal issues and criticisms

    Negative Logits
    Token       Logit
    "lov"       -0.18
    "lok"       -0.16
    "ieri"      -0.15
    "že"        -0.14
    "aya"       -0.14
    " Zam"      -0.14
    "ì´"        -0.14
    " ilma"     -0.14
    "juan"      -0.14
    " ÎŃν"      -0.14
    Positive Logits
    Token               Logit
    ".scalablytyped"    0.19
    "strup"             0.15
    "uger"              0.14
    " TKey"             0.14
    " unt"              0.14
    "eker"              0.14
    "goog"              0.14
    "ete"               0.14
    "ugh"               0.13
    "keterangan"        0.13
    Activation Density 0.417%

    No Known Activations