INDEX
    Explanations

    terms and phrases related to social dialogues and discussions about gender norms in various contexts

    Negative Logits
    )";     -1.20
    '},     -1.10
    "},     -1.04
    `,      -1.03
    "],     -1.01
    "])     -1.01
    `;      -0.99
    }")     -0.99
    "];     -0.99
    ',      -0.99

    Positive Logits
    Á               0.53
     Nord           0.52
    ú               0.51
     Nor            0.51
    awt             0.51
     doubt          0.48
                    0.48
     Eagles         0.48
    лу              0.48
    BeginContext    0.47
    Act Density 2.463%

    No Known Activations