INDEX
    Explanations

    references to societal norms and constructs involving identity, belonging, and the consequences of beliefs

    Negative Logits
    MBER
    -0.14
     ArgumentNullException
    -0.13
     XCTAssertTrue
    -0.13
     coppia
    -0.13
    abus
    -0.13
    اض
    -0.13
    essim
    -0.13
    zin
    -0.13
    ISIBLE
    -0.13
    ΕΠ
    -0.13
    Positive Logits
     non
    1.10
     Non
    0.96
    non
    0.92
    Non
    0.91
     NON
    0.90
    非
    0.86
    -non
    0.84
    _non
    0.82
    .non
    0.78
    (non
    0.77
    Act Density 0.264%

    No Known Activations