Explanations
words related to discrimination, criticism, and social movements
Negative Logits
ilogy     -0.73
mma       -0.69
ften      -0.68
ariat     -0.67
undrum    -0.66
agame     -0.64
THERE     -0.64
inator    -0.64
iphate    -0.63
ocene     -0.63
Positive Logits
situations     0.86
amounts        0.85
occasions      0.82
quantities     0.79
proportions    0.78
levels         0.77
objects        0.76
threats        0.76
periods        0.75
parts          0.75
Activations Density: 12.822%