INDEX
    Explanations

    expressions related to personal interests and passions

    Negative Logits
    "AxisAlignment"    -0.55
    " שוליים"    -0.54
    " становника"    -0.53
    " ImGui"    -0.52
    "Hentet"    -0.51
    " ivelany"    -0.49
    "protoc"    -0.49
    "twimg"    -0.49
    "الدراسه"    -0.49
    "ImGui"    -0.48
    Positive Logits
    " passion"       2.19
    " love"          1.82
    "passion"        1.80
    " passione"      1.73
    " Passion"       1.68
    " pasión"        1.67
    "Passion"        1.65
    " paixão"        1.65
    " passionate"    1.64
    " passions"      1.62
    Activation Density: 0.275%

    No Known Activations