INDEX
    Explanations

    expressions of frustration or difficulty with problem-solving

    Negative Logits
    nger        -0.15
    achs        -0.15
    ầm          -0.15
    yscale      -0.15
    ATRIX       -0.14
    ÙĪÙħÛĮ      -0.14
     Cousins    -0.14
    .BLL        -0.13
    996         -0.13
    veis        -0.13
    Positive Logits
     nada           0.25
     STILL          0.19
     nothing        0.19
     results        0.19
     Still          0.19
     success        0.17
     luck           0.17
     Nothing        0.17
     still          0.17
     improvement    0.17
    Activation Density 0.117%

    No Known Activations