INDEX
    Explanations

    references to statistical or evaluative statements regarding social issues

    Negative Logits
    his
    -0.22
     his
    -0.18
    uzzi
    -0.17
    他的
    -0.17
     그의
    -0.17
    辅
    -0.15
    ÜR
    -0.15
    strcasecmp
    -0.15
    lán
    -0.15
     HIS
    -0.14
    Positive Logits
     he
    0.28
    他
    0.21
     он
    0.20
     He
    0.19
     she
    0.19
     він
    0.18
     reference
    0.17
    He
    0.17
     point
    0.17
     Echo
    0.17
    Act Density 0.166%

    No Known Activations