INDEX
    Explanations

    female family and roles

    NEGATIVE LOGITS
     boyhood        0.65
    ‍♂️             0.58
     യുവാ           0.54
     Businessman    0.54
     remaster       0.53
    youtu           0.52
     Technician     0.52
     Corrosion      0.51
     инженер        0.49
    வன்             0.49

    POSITIVE LOGITS
    👭              0.82
     meninas        0.79
     sisters        0.75
    姐妹            0.75
     mamá           0.73
     actresses      0.73
     princesses     0.73
     엄마           0.72
     niñas          0.71
     cantik         0.71
    Act Density 0.023%

    No Known Activations