INDEX
    Explanations

    instances of the word "work" and its variations, indicating a focus on effort and productivity

    Negative Logits
     Seks
    -0.15
    ught
    -0.14
    ads
    -0.14
    еÑĢÑĪ
    -0.14
     Baldwin
    -0.14
    #ad
    -0.14
    elan
    -0.14
    gone
    -0.13
     Herman
    -0.13
    çŁ¢
    -0.13
    Positive Logits
     harder
    0.19
     magic
    0.18
    magic
    0.17
     hardest
    0.17
     Magic
    0.16
     hard
    0.16
    åĿĬ
    0.16
    人åĵ¡
    0.16
    Magic
    0.15
     out
    0.15
    Act Density 0.057%

    No Known Activations