    Explanations

    terms related to political and social issues, particularly those affecting specific groups or regions

    Negative Logits
    Token                  Logit
    " Dud"                 -0.15
    "ç²ī"                  -0.14
    ".documents"           -0.13
    " usu"                 -0.13
    "151"                  -0.13
    "bedo"                 -0.13
    "æī¶"                  -0.12
    "cancellationToken"    -0.12
    "IBLE"                 -0.12
    "989"                  -0.12
    Positive Logits
    Token                  Logit
    "ur"                   0.91
    "UR"                   0.80
    " ur"                  0.77
    " Ur"                  0.73
    "Ur"                   0.69
    " UR"                  0.68
    "ÑĥÑĢ"                 0.63
    "_ur"                  0.63
    ".ur"                  0.61
    "urs"                  0.60
    Activation Density: 0.267%

    No Known Activations