INDEX
    Explanations

    statements made by individuals in the context of discussions or reports on social issues

    Negative Logits
    Token        Logit
    "raya"       -0.17
    "uten"       -0.16
    "utt"        -0.15
    "оку"        -0.15
    "missive"    -0.15
    "rud"        -0.15
    "leon"       -0.15
    " voks"      -0.15
    "vell"       -0.15
    "gue"        -0.14
    Positive Logits
    Token           Logit
    ".glide"        0.14
    "ark"           0.14
    " Rh"           0.13
    " standards"    0.13
    "instein"       0.13
    " скор"         0.13
    "akers"         0.13
    " prince"       0.13
    " răng"         0.12
    "reate"         0.12
    Activation Density: 0.126%

    No Known Activations