INDEX
    Explanations

    terms related to political and social issues

    Negative Logits
    Token          Logit
    "ORB"          -0.15
    "ált"          -0.15
    "ables"        -0.15
    "erez"         -0.15
    "isma"         -0.14
    "ird"          -0.14
    "iates"        -0.14
    "ive"          -0.14
    "edula"        -0.14
    "(er"          -0.14

    Positive Logits
    Token          Logit
    " speaking"    0.38
    "-speaking"    0.33
    " sound"       0.30
    " Speaking"    0.28
    " minded"      0.26
    "sound"        0.25
    "spe"          0.25
    "Speaking"     0.25
    " challenged"  0.21
    " SOUND"       0.21
    Activation Density: 0.052%

    No Known Activations