Explanations
phrases related to social justice and inequality
Negative Logits
朔              -0.54
estimés         -0.53
+#+#            -0.53
oplasma         -0.50
exclusivity     -0.50
Dernière        -0.49
chlag           -0.49
Necessity       -0.48
useDispatch     -0.48
tere            -0.48
Positive Logits
InputDecoration  0.71
مرئيه            0.70
شاهد             0.58
inaction         0.57
stander          0.57
standers         0.56
Letting          0.55
passively        0.53
Letting          0.53
ALLOW            0.53
Activations Density: 0.352%