    Explanations

    phrases indicating ambition or dedication to achieving goals

    Negative Logits
    Token     Logit
    "atee"    -0.15
    "/up"     -0.15
    "most"    -0.15
    "weg"     -0.15
    "xit"     -0.14
    "k"       -0.14
    "nee"     -0.14
    "ega"     -0.14
    "/by"     -0.14
    "ne"      -0.14
    Positive Logits
    Token         Logit
    " harder"     0.25
    " towards"    0.24
    " toward"     0.23
    " hardest"    0.23
    "-hard"       0.22
    "Towards"     0.20
    " hard"       0.19
    "hard"        0.19
    " Towards"    0.18
    " HARD"       0.18
    Activation Density: 0.009%

    No Known Activations