INDEX
    Explanations

    phrases related to body image and self-acceptance

    Negative Logits
    Token        Logit
    "arius"      -0.16
    "rien"       -0.15
    "aire"       -0.15
    " Cone"      -0.14
    "vez"        -0.14
    " finger"    -0.14
    "berry"      -0.13
    "ux"         -0.13
    " Rod"       -0.13
    "кав"        -0.13
    Positive Logits
    Token           Logit
    " parts"        0.20
    " temple"       0.19
    " Parts"        0.18
    "mind"          0.18
    "Parts"         0.18
    "parts"         0.18
    " functions"    0.18
    ".react"        0.17
    "guards"        0.17
    "-functions"    0.16
    Activation Density: 0.041%

    No Known Activations