INDEX
Explanations
phrases related to fairness and justice
concepts related to fairness and justice
Negative Logits
Token         Logit
apse          -0.81
hent          -0.75
CHAT          -0.67
ember         -0.63
OUS           -0.63
Underground   -0.63
acid          -0.61
OPLE          -0.60
artifacts     -0.60
uality        -0.60
Positive Logits
Token          Logit
yt             1.32
grounds        1.05
fair           0.87
ground         0.83
trade          0.80
eport          0.79
iciary         0.76
ies            0.75
parency        0.74
compensation   0.73
Activations Density: 0.052%