INDEX
    Explanations

    references to social issues and the implications of actions on society

    Negative Logits
    ersen           -0.19
    readcr          -0.18
    rych            -0.16
    роч             -0.14
    ocks            -0.14
    enario          -0.14
    plr             -0.14
    /inet           -0.14
    zych            -0.14
     addCriterion   -0.13
    Positive Logits
     indeed         0.17
     وأن            0.15
    far             0.14
     Batt           0.14
    asil            0.13
    oplast          0.13
    ossa            0.13
    arend           0.13
    oop             0.13
    лага            0.12
    Act Density 1.094%

    No Known Activations