INDEX
    Explanations

    references to specific groups or categories of people

    references to specific groups of people or demographics

    Negative Logits
    Thumbnail     -0.73
    strument      -0.73
    ograph        -0.67
    la            -0.66
    ories         -0.66
     Tropical     -0.65
    Chain         -0.65
    clusive       -0.62
    rament        -0.61
    rick          -0.61
    Positive Logits
     complain     0.84
    selves        0.82
     who          0.82
    '             0.79
    opausal       0.76
     prefer       0.74
     wishing      0.70
     complained   0.70
    folk          0.69
     are          0.69
    Activation Density: 0.421%

    No Known Activations