INDEX
    Explanations

    mentions of social issues and the representation of marginalized voices

    Negative Logits
    штов
    -0.15
     Glover
    -0.15
    'gc
    -0.15
    ійс
    -0.14
    egt
    -0.14
    grily
    -0.14
    ornings
    -0.13
    vent
    -0.13
    λιά
    -0.13
    oader
    -0.13
    Positive Logits
     receives
    0.35
     receive
    0.35
     receiving
    0.31
     Receive
    0.28
    receive
    0.27
    Receive
    0.25
     received
    0.24
     RECEIVE
    0.23
     Rece
    0.23
     recibir
    0.23
    Act Density 0.198%

    No Known Activations