INDEX
    Explanations

    mentions of effort and labor-related concepts

    Negative Logits
    Token       Logit
    "ught"      -0.20
    "idenav"    -0.16
    "inne"      -0.16
    "unya"      -0.14
    "/by"       -0.14
    "lendir"    -0.14
    "ELLOW"     -0.14
    "æĺŃ"       -0.13
    "avian"     -0.13
    "755"       -0.13
    Positive Logits
    Token         Logit
    " worked"     0.24
    "-working"    0.23
    "worked"      0.21
    "working"     0.21
    " working"    0.20
    " Working"    0.20
    "Working"     0.20
    " out"        0.20
    " toward"     0.19
    " towards"    0.18
    Activation Density: 0.045%

    No Known Activations