INDEX
    Explanations

    phrases related to fairness and justice

    Negative Logits
    Token            Logit
    "apse"           -0.86
    "CHAT"           -0.83
    "hent"           -0.76
    "OPLE"           -0.74
    "Assembly"       -0.68
    "uality"         -0.68
    "acid"           -0.65
    "OUS"            -0.63
    "artifacts"      -0.62
    "hal"            -0.61

    Positive Logits
    Token            Logit
    "yt"              1.15
    "grounds"         1.07
    "fair"            1.02
    "itably"          0.88
    "iciary"          0.87
    "ground"          0.85
    " compensation"   0.78
    "child"           0.77
    "trade"           0.75
    " fair"           0.73
    Act Density 0.642%

    No Known Activations