INDEX
    Explanations

    phrases related to work and labor conditions

    phrases related to adverse effects and medical issues

    Negative Logits
    Token             Logit
    "Truth"           -0.89
    "erest"           -0.89
    "nesday"          -0.88
    " ðŁĻĤ"           -0.88
    "ruary"           -0.87
    "NAS"             -0.83
    "soType"          -0.82
    "fuck"            -0.82
    "ultimate"        -0.82
    " ðŁĺ"            -0.81

    Positive Logits
    Token             Logit
    " dozens"         1.07
    " varying"        1.07
    " makeshift"      1.06
    " rudimentary"    1.05
    " roadside"       1.05
    " sophisticated"  1.00
    " myriad"         0.99
    " specialized"    0.97
    " frequent"       0.97
    " elaborate"      0.97
    Activation Density: 0.732%

    No Known Activations