INDEX
    Explanations

    words related to medical conditions, especially the adjective "sick" with varying intensities

    references to the word "sick" in various contexts

    Negative Logits
    Token             Logit
    " unlaw"          -0.78
    " rul"            -0.71
    " compr"          -0.69
    "merce"           -0.67
    " unden"          -0.65
    " sanctioned"     -0.65
    " Unch"           -0.65
    "ETHOD"           -0.63
    " principals"     -0.61
    "NPR"             -0.61
    Positive Logits
    Token             Logit
    "ening"           1.35
    "ened"            1.26
    "bay"             1.16
    "nesses"          0.98
    "er"              0.93
    "ly"              0.92
    "ness"            0.90
    "le"              0.88
    "ert"             0.87
    "igan"            0.87
    Act Density 0.021%

    No Known Activations