INDEX
    Explanations

    phrases related to doing work, often of a physical or laborious nature

    phrases related to undesirable or unpleasant tasks

    Negative Logits
    urated        -0.70
     Sett         -0.69
    oor           -0.68
     Bound        -0.68
     Arri         -0.68
     Continued    -0.67
     Courage      -0.67
    ãĤ¦ãĤ¹        -0.67
     Returning    -0.66
    eming         -0.66
    Positive Logits
     grunt        0.89
     chores       0.83
     homework     0.80
    strip         0.75
     differently  0.74
     offline      0.74
     rehab        0.71
     migrate      0.68
     thing        0.68
     experiment   0.68
    Act Density 0.256%

    No Known Activations