    Explanations

    references to social hierarchies and relationships

    Negative Logits
     kvinnor          -0.67
     kvinder          -0.67
     vrouwen          -0.61
     девочки          -0.61
     girls            -0.59
     ženy             -0.59
    PerformLayout     -0.58
    Girls             -0.57
     žena             -0.57
     mujeres          -0.57

    Positive Logits
     spin             0.56
    spin              0.48
     courtes          0.46
     virgin           0.45
     dow              0.45
     maiden           0.45
     Spin             0.43
     Amazon           0.43
     ny               0.42
     vir              0.41
    Act Density 0.473%

    No Known Activations