INDEX
Explanations
mentions of the concept of fairness or fair treatment
occurrences of the word "fair" and its variations
Negative Logits
Token        Logit
uality       -0.78
Ion          -0.70
Immunity     -0.69
Emer         -0.65
Hobby        -0.64
Armory       -0.64
è¦ļéĨĴ       -0.63
Biological   -0.62
ulse         -0.61
ODUCT        -0.61
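The token è¦ļéĨĴ in the negative-logit list looks like encoding debris, but it has the shape of a byte-level BPE vocabulary entry shown in its raw form. The sketch below assumes, without confirmation from this page, that the underlying tokenizer uses the GPT-2 style byte-to-unicode mapping; the function names are illustrative, not part of any dashboard API. Under that assumption the token decodes to the Japanese word 覚醒.

```python
def gpt2_bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table.

    Assumption: the vocabulary behind this dashboard uses this mapping.
    """
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


def decode_byte_level_token(token: str) -> str:
    """Map each displayed character back to its byte, then decode as UTF-8."""
    char_to_byte = {ch: b for b, ch in gpt2_bytes_to_unicode().items()}
    raw = bytes(char_to_byte[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


decoded = decode_byte_level_token("è¦ļéĨĴ")
print(decoded)
```

A token that contains a character outside the mapping raises `KeyError`, which is a useful signal that the byte-level assumption does not hold for that vocabulary.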
Positive Logits
Token        Logit
faire        1.26
sts          0.90
iour         0.88
luaj         0.87
rant         0.83
iciary       0.83
rates        0.82
dylib        0.82
hest         0.82
rants        0.82
Activations Density 0.009%
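The page does not define the density figure. A common reading, assumed here, is the fraction of sampled tokens on which the feature has a nonzero activation, so 0.009% would mean roughly 9 active tokens per 100,000. A minimal sketch of that calculation follows; the function name, the threshold parameter, and the toy data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens where the feature fires.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (not from the dashboard): one active token out of 10,000.
toy_activations = [0.0] * 9999 + [1.7]
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

A density this low is consistent with a feature that fires only on a narrow set of contexts, such as the "fair" mentions described in the explanations.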