INDEX
    Explanations

    words related to negative judgment of intelligence or actions

    the word "stupid" and its variations in various contexts

    Negative Logits
    Token               Logit
    "AUT"               -0.93
    "apers"             -0.83
    " largeDownload"    -0.83
    "APH"               -0.82
    "accompan"          -0.76
    "riott"             -0.75
    "aver"              -0.75
    "orthy"             -0.74
    "rigan"             -0.73
    "arnaev"            -0.71
    Positive Logits
    Token               Logit
    "nesses"            1.04
    "ly"                0.95
    "ness"              0.89
    "itude"             0.81
    "gery"              0.80
    "glers"             0.77
    " stupid"           0.77
    "founded"           0.77
    "ged"               0.71
    " shit"             0.71
    Activation Density: 0.035%

    No Known Activations