INDEX
Explanations
phrases related to fairness and justice
phrases related to fairness and social equity
Negative Logits
orio      -0.58
onto      -0.54
idi       -0.52
requires  -0.52
arta      -0.51
fet       -0.49
aka       -0.49
arget     -0.49
psey      -0.48
onga      -0.48
Positive Logits
last        0.59
yesterday   0.58
originally  0.57
initially   0.55
earlier     0.53
terday      0.50
Yanuk       0.49
beforehand  0.49
enthusi     0.48
iscons      0.48
Activations Density: 2.973%