    Explanations

    words that express strong emotional states or conflicts

    Negative Logits
    "父子"            -0.79
    "的男人"          -0.78
    " muž"            -0.75
    " pria"           -0.71
    "👬"              -0.70
    " męski"          -0.70
    "Mr"              -0.69
    " masculina"      -0.69
    " mannen"         -0.69
    " brotherhood"    -0.68
    Positive Logits
    " woman"          1.78
    " lady"           1.69
    " women"          1.63
    " female"         1.58
    "woman"           1.56
    " Woman"          1.53
    " girl"           1.45
    "women"           1.45
    "Woman"           1.43
    " Women"          1.42
    Activation Density: 1.162%

    No Known Activations