INDEX
    Explanations

    emotional and relational dynamics, particularly around loss and caring actions

    Negative Logits
    Token                Logit
    " praticamente"      -0.70
    "Darn"               -0.70
    " Aufgrund"          -0.68
    " sumamente"         -0.68
    "viamente"           -0.67
    " äußerst"           -0.65
    " extremadamente"    -0.64
    " Asimismo"          -0.63
    ")!"                 -0.63
    " např"              -0.63
    Positive Logits
    Token                Logit
    " fucking"            0.71
    "fucking"             0.66
    "noons"               0.66
    "genstein"            0.65
    " fucked"             0.64
    "fuck"                0.62
    " stillness"          0.58
    " fuck"               0.58
    " nameless"           0.56
    "رشف"                 0.55
    Activation Density: 0.660%

    No Known Activations