INDEX
    Explanations

    expressions of emotional struggle and resilience

    Negative Logits
    Token     Logit
    "tc"      -0.17
    "etz"     -0.15
    "emer"    -0.15
    "yx"      -0.15
    "abad"    -0.15
    "upe"     -0.15
    "διο"     -0.14
    "ived"    -0.14
    "flo"     -0.14
    "ep"      -0.14
    Positive Logits
    Token          Logit
    " handle"      0.45
    " handles"     0.38
    " handling"    0.38
    " tolerate"    0.38
    " Handle"      0.37
    " toler"       0.37
    "handle"       0.37
    "tol"          0.35
    ".handle"      0.35
    " Handling"    0.34
    Activation Density: 0.128%

    No Known Activations