    Explanations

    instructions or descriptions related to personal care and hygiene

    Negative Logits
    " Mortal"     -0.74
    "rian"        -0.72
    "ravel"       -0.71
    "REDACTED"    -0.65
    "*/("         -0.61
    "CHA"         -0.59
    "sup"         -0.59
    "ivist"       -0.58
    " hemor"      -0.57
    "elist"       -0.57
    Positive Logits
    "robe"         1.10
    " curtain"     0.99
    "tub"          0.98
    " curtains"    0.94
    "bed"          0.94
    "ing"          0.93
    " showers"     0.91
    "atur"         0.89
    "ysis"         0.86
    " shower"      0.86
    Activation Density: 0.026%

    No Known Activations