Explanations
phrases related to injustice and its impacts on individuals or groups
Negative Logits
Angiosper    -0.16
иÑĨин        -0.16
eton         -0.15
Auss         -0.15
.gb          -0.14
kyt          -0.14
_PHY         -0.14
enti         -0.14
åĺ           -0.13
krit         -0.13
Positive Logits
impulse      0.15
ORIES        0.15
ULE          0.15
impres       0.14
upported     0.14
.dump        0.14
avage        0.14
avel         0.14
691          0.14
aku          0.13
Activations Density 0.135%